Política Brasileira entry #3
I keep cleaning the dataset and applying functions to further prepare the file for model deployment. Today I categorised the column with the reasons for each spend (I've grouped them into 6 categories) and made dummy variables out of them. This should help the regression models predict the overall spend for the following years. Next, I'm gonna try to fill in the missing values with the correct information (the first 50 entries are missing the congressman's name, party and home state), which is a piece of information I'll have to get from the API. Once that's done, I'm gonna upload the file into Tableau to start visualising it. Once I have a feel for the data, I'm gonna "functionalise" the script so I can run the files from previous years through it. I'm gonna have to keep in mind that the congress changes every four years, so I'm gonna analyse by "time-chunks"; that is, how each congressman's spending has evolved within a term. That's the last step before presenting conclusions. I still need to think about how I'm gonna create a target variable to predict, as I plan to deploy the model as a beta version of the project.
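For reference, the categorise-then-dummy step can be sketched roughly like this with pandas. This is only a minimal illustration, not the actual script: the column names (`reason`, `amount`), the sample rows, the `REASON_GROUPS` mapping and the `add_reason_dummies` helper are all made up for the example, and the real reason labels in the dataset will differ.

```python
import pandas as pd

# Hypothetical sample data; the real column names and reason labels will differ.
expenses = pd.DataFrame({
    "reason": ["Airline tickets", "Fuel", "Office rent", "Taxi",
               "Meals", "Postal services", "Consulting"],
    "amount": [1200.0, 300.0, 2500.0, 80.0, 150.0, 60.0, 900.0],
})

# Hypothetical mapping from raw reasons to broader groups.
# Anything not listed here falls into the default group ("other").
REASON_GROUPS = {
    "Airline tickets": "travel",
    "Taxi": "travel",
    "Fuel": "vehicle",
    "Office rent": "office",
    "Meals": "food",
    "Postal services": "communication",
}


def add_reason_dummies(df, mapping, default="other"):
    """Return a copy of df with a grouped reason column plus one 0/1 column per group."""
    out = df.copy()
    # Map each raw reason to its group; unmapped reasons get the default group.
    out["reason_group"] = out["reason"].map(mapping).fillna(default)
    # One indicator column per group, e.g. reason_travel, reason_food, ...
    dummies = pd.get_dummies(out["reason_group"], prefix="reason", dtype=int)
    return pd.concat([out, dummies], axis=1)


prepared = add_reason_dummies(expenses, REASON_GROUPS)
print(prepared.filter(like="reason_").head())
```

Wrapping the step in a function is deliberate: it is the kind of piece that can later be reused when the files from previous years go through the same pipeline.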