This talk demonstrates methods for optimizing big data for analysis using Python and the Pandas, Matplotlib, SciPy, and Seaborn libraries. Subsets of the raw data will be provided before the talk in GitHub or GitLab repositories. The talk will discuss the importance of data cleansing and code reusability for big data analysis and visualization, and will produce line and bar charts displaying the results of big data processing.
The talk will focus on running code in Jupyter Notebook to provide examples of building and debugging scripts. Attendees will have the option to run the scripts themselves in Jupyter Notebook. A discussion of the importance of analyzing and visualizing big data to produce insights for local government will follow.
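
As a rough illustration of the data cleansing and code reusability themes, the following is a minimal sketch rather than the code used in the talk. The sample data, the column names (`department`, `date`, `requests`), and the helper names `clean_chunk` and `monthly_totals` are all hypothetical, and reading the file in chunks is only one assumed way of handling data too large to load at once.

```python
import io

import pandas as pd

# Hypothetical raw data subset; columns and values are invented for illustration.
RAW_CSV = """department,date,requests
Parks,2023-01-05,12
parks ,2023-01-19,8
Roads,2023-01-07,
Roads,not a date,5
Roads,2023-02-11,20
Parks,2023-02-02,15
"""


def clean_chunk(chunk):
    """Reusable cleansing step: normalize labels, coerce types, drop bad rows."""
    out = chunk.copy()
    out["department"] = out["department"].str.strip().str.title()
    # Unparseable dates and numbers become NaT/NaN instead of raising.
    out["date"] = pd.to_datetime(out["date"], format="%Y-%m-%d", errors="coerce")
    out["requests"] = pd.to_numeric(out["requests"], errors="coerce")
    return out.dropna(subset=["date", "requests"])


def monthly_totals(source, chunksize=3):
    """Aggregate requests per department and month, reading the CSV in chunks."""
    parts = []
    for chunk in pd.read_csv(source, chunksize=chunksize):
        cleaned = clean_chunk(chunk)
        cleaned = cleaned.assign(
            month=cleaned["date"].dt.to_period("M").astype(str)
        )
        parts.append(cleaned.groupby(["department", "month"])["requests"].sum())
    # Combine the per-chunk partial sums into one total per group.
    combined = pd.concat(parts).groupby(level=["department", "month"]).sum()
    # Rows are months, columns are departments; missing combinations become 0.
    return combined.unstack("department", fill_value=0)


totals = monthly_totals(io.StringIO(RAW_CSV))
print(totals)
```

In a Jupyter Notebook cell, a table shaped like `totals` could then be charted with `totals.plot(kind="bar")` or `totals.plot(kind="line")`, both of which rely on Matplotlib being installed. Keeping the cleansing logic in one function means the same step can be reused across notebooks and data subsets.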