DATA1001: Foundations of Data Science

University of Sydney 2024 S2
Completed
R
Statistics
Visualization
Hypothesis Testing

Built statistical thinking from study design to hypothesis testing, using base R and ggplot2.

Learning Outcomes

  • Statistical Foundations: Articulated the role of statistics in society, with emphasis on ethical use, privacy, and big-data challenges.
  • Study Design & Interpretation: Evaluated how sampling and experimental design influence conclusions and limitations of data analysis.
  • Data Summarization & Visualization: Produced and interpreted graphical & numerical summaries using base R and ggplot2.
  • Probability & Inference: Applied the normal approximation and box models to describe chance variation and measurement error.
  • Modeling Relationships: Built and explained linear regression models to analyze relationships between variables.
  • Hypothesis Testing: Formulated hypotheses, ran appropriate tests, and interpreted p-values while avoiding common pitfalls.
  • Critical Thinking: Assessed bias, confounding, and misuse of statistics in media and published research.
  • Team-Based Exploration: Delivered collaborative analyses via reproducible reports and oral presentations.

Takeaways

This course was a true turning point for me. At the beginning, I knew almost nothing about R or statistics, but step by step I learned how to run exploratory data analysis (EDA), frame clear hypotheses, and apply tests to evaluate them. Our team project was recognized as one of the top 5 among more than 800 peers, and I later earned a top 5 spot in the individual project as well. What made this journey meaningful was not just the grades but the growth: learning to clean and interpret data systematically, asking the right questions as if I were consulting a client, and building confidence in applying statistical reasoning to real problems. R pushed me to think in tidy pipelines, while the statistical foundation gave me tools to question assumptions and defend conclusions responsibly. Looking back, I see this course as more than a class: it was the moment I discovered how data and statistics could become a lens for problem-solving, and how persistence can transform uncertainty into clarity.