Coding for Data Journalism

Coding for Data Journalism is a critical skill in the Graduate Certificate in Data Journalism program. This section will explain key terms and vocabulary that are essential to understanding the concepts and techniques used in coding for data journalism.

1. Coding: Coding is the process of writing instructions for computers to execute. In data journalism, coding is used to extract, clean, analyze, and visualize data. Python, R, and JavaScript are the most commonly used programming languages in the field.
2. Data Journalism: Data journalism is a form of journalism that uses data to tell stories. Data journalists gather, analyze, and visualize data to uncover insights and trends that inform the public.
3. Data: Data is a collection of facts, statistics, and other information that can be analyzed to reveal patterns, trends, and insights. Data can be collected from many sources, such as government databases, surveys, and social media platforms.
4. Python: Python is a popular open-source programming language in data journalism. It is widely regarded as beginner-friendly and has a large community of users. Python is used for data cleaning, analysis, and visualization.
5. R: R is a programming language designed for statistical analysis and visualization. It is popular in data journalism because of its strong statistical capabilities, and it is also used for data cleaning.
6. JavaScript: JavaScript is the programming language of the web browser. Data journalists use it to build interactive data visualizations and graphics.
7. Data Cleaning: Data cleaning is the process of identifying and correcting errors, inconsistencies, and inaccuracies in data. It is an essential step in data journalism because any analysis or visualization is only as reliable as the data behind it.
8. Data Analysis: Data analysis is the process of examining data to uncover insights and trends. Techniques include statistical analysis, machine learning, and data mining.
9. Data Visualization: Data visualization is the process of creating visual representations of data to communicate insights and trends. Common tools include Tableau, D3.js, and Matplotlib.
10. APIs: APIs (Application Programming Interfaces) are sets of rules and protocols that allow software applications to communicate with each other. Data journalists use APIs to request data from sources such as government databases, social media platforms, and commercial services.
11. SQL: SQL (Structured Query Language) is a language for querying and managing relational databases. Data journalists use SQL to extract data from databases, filter rows, and join tables.
12. JSON: JSON (JavaScript Object Notation) is a text format for exchanging data between software applications. Many APIs and web services return their data as JSON, so data journalists often need to parse it.
13. Git: Git is a version control system that tracks changes to files in a code repository. Data journalists use Git to manage their code and to collaborate with colleagues.
14. Jupyter Notebook: Jupyter Notebook is an open-source web application for data analysis and visualization. It lets users create and share documents that contain live code, equations, visualizations, and narrative text.
15. Pandas: Pandas is a Python library for data manipulation and analysis. It provides data structures and functions for cleaning, transforming, and analyzing tabular data.
16. NumPy: NumPy is a Python library for numerical computing. It provides multidimensional arrays and mathematical functions for efficient calculation on large sets of numbers.
17. Matplotlib: Matplotlib is a Python library for data visualization. It provides functions for creating static, animated, and interactive charts.
18. Seaborn: Seaborn is a Python library for statistical data visualization, built on top of Matplotlib. It provides functions for creating heatmaps, distribution plots, and regression plots.
19. D3.js: D3.js is a JavaScript library for data visualization. It provides functions for creating interactive and dynamic visualizations in the browser.
20. Challenge: A challenge is a task or problem that requires applying coding and data journalism skills. Challenges are a way to practice and improve those skills.
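To make the JSON and SQL entries more concrete, here is a minimal sketch that uses only Python's standard library. The JSON text, the table names (`cases`, `population`), and every number in it are invented for illustration and do not come from any real API or database. It parses a JSON response, loads it into a temporary database, then filters and joins with SQL.

```python
import json
import sqlite3

# Hypothetical API response: case counts per state, as JSON text.
raw = '''
[
  {"state": "Ohio", "cases": 120},
  {"state": "Texas", "cases": 300},
  {"state": "Utah", "cases": null}
]
'''

records = json.loads(raw)  # JSON text -> list of Python dicts

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE cases (state TEXT, cases INTEGER)")
conn.execute("CREATE TABLE population (state TEXT, residents INTEGER)")

# Named placeholders (:state, :cases) are filled from each dict.
conn.executemany("INSERT INTO cases VALUES (:state, :cases)", records)
conn.executemany(
    "INSERT INTO population VALUES (?, ?)",
    [("Ohio", 1000), ("Texas", 2000), ("Utah", 500)],  # made-up figures
)

# Filter out missing counts, join to population, and compute a rate.
query = """
    SELECT c.state, c.cases, 1000.0 * c.cases / p.residents AS per_1000
    FROM cases AS c
    JOIN population AS p ON p.state = c.state
    WHERE c.cases IS NOT NULL
    ORDER BY per_1000 DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for state, cases, per_1000 in rows:
    print(f"{state}: {cases} cases, {per_1000:.1f} per 1,000 residents")
```

The row with a missing count is excluded by the `WHERE` clause rather than being treated as zero, which is usually the safer choice when reporting rates.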

Example: Suppose a data journalist wants to analyze the trend in COVID-19 cases in a particular state. The journalist can use Python to retrieve data from a government database or API, clean it with Pandas, analyze it with Pandas and NumPy, and visualize it with Matplotlib or Seaborn. A Jupyter Notebook can hold the whole workflow in one document that combines live code, equations, visualizations, and narrative text.

In this example, the data journalist applies four core skills: data extraction, data cleaning, data analysis, and data visualization. Each step relies on tools and libraries from the list above: Python, Pandas, NumPy, Matplotlib, and Seaborn.
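The cleaning and analysis steps of this example might look roughly like the sketch below. It assumes the data has already been retrieved. The column names (`date`, `new_cases`) and all values are made up, and fitting a straight line with `np.polyfit` is only one simple way to describe a trend, not a recommended method for real epidemiological data.

```python
import numpy as np
import pandas as pd

# Made-up daily records standing in for data pulled from an API.
raw = pd.DataFrame(
    {
        "date": ["2021-01-01", "2021-01-02", "2021-01-02", "2021-01-03", "2021-01-04"],
        "new_cases": ["100", "120", "120", None, "160"],
    }
)

# Cleaning: drop exact duplicate rows, parse types, remove missing counts.
clean = raw.drop_duplicates().copy()
clean["date"] = pd.to_datetime(clean["date"])
clean["new_cases"] = pd.to_numeric(clean["new_cases"], errors="coerce")
clean = clean.dropna(subset=["new_cases"]).sort_values("date")

# Analysis: fit a straight line to cases over time with NumPy.
days = (clean["date"] - clean["date"].min()).dt.days.to_numpy()
slope, intercept = np.polyfit(days, clean["new_cases"].to_numpy(), 1)

print(f"Rows kept: {len(clean)}")
print(f"Average change: {slope:.1f} cases per day")
```

Counting the rows kept after cleaning is worth doing in any real story, because readers and editors may ask how much of the source data was discarded and why.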

Challenge: Create a Python script that extracts data from a government database or API, cleans the data using Pandas, analyzes the data using NumPy, and visualizes the data using Matplotlib. The script should produce a bar chart of the 10 states with the highest number of COVID-19 cases.
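As a possible starting point for the charting step only, the sketch below ranks states and draws the bar chart. The state labels and counts are placeholders, not real figures, and the output filename `top10_cases.png` is an arbitrary choice. The extraction and cleaning steps are left for you to write.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder numbers, not real case counts.
totals = pd.DataFrame(
    {
        "state": [f"State {c}" for c in "ABCDEFGHIJKL"],
        "cases": [50, 420, 130, 610, 90, 300, 275, 40, 510, 180, 360, 220],
    }
)

# Keep the 10 rows with the largest values, sorted from high to low.
top10 = totals.nlargest(10, "cases")

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(top10["state"], top10["cases"])
ax.set_title("Top 10 states by cases (placeholder data)")
ax.set_ylabel("Cases")
ax.tick_params(axis="x", rotation=45)  # tilt labels so they do not overlap
fig.tight_layout()
fig.savefig("top10_cases.png")
```

When you replace the placeholder table with real data, consider whether raw counts or counts per capita tell the fairer story, since large states will dominate a chart of raw totals.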

Conclusion: Coding is a critical skill in the Graduate Certificate in Data Journalism program. The terms covered here fall into four groups: core concepts (coding, data journalism, data, data cleaning, data analysis, data visualization), languages (Python, R, JavaScript, SQL), data formats and interfaces (APIs, JSON), and tools and libraries (Git, Jupyter Notebook, Pandas, NumPy, Matplotlib, Seaborn, D3.js). By practicing these skills, data journalists can uncover insights and trends that inform the public.

Key takeaways

  • Coding in data journalism covers four tasks: extracting, cleaning, analyzing, and visualizing data.
  • APIs let data journalists request data from sources such as government databases and social media platforms, and that data often arrives as JSON.
  • A typical Python workflow retrieves data from a government database or API, cleans it with Pandas, analyzes it with Pandas and NumPy, and visualizes it with Matplotlib or Seaborn.
  • SQL, Git, and Jupyter Notebook support that workflow by querying databases, tracking changes to code, and combining code with narrative text.
  • The challenge is a way to practice the full workflow: write a Python script that extracts, cleans, analyzes, and charts COVID-19 case data by state.
  • Understanding the key terms, from coding and data cleaning to Pandas, Seaborn, and D3.js, is essential for success in data journalism.