📊 Course Introduction: Statistics & Data Science II

Jesper Lindmarker

👋 About Me

🏰 Before academia…

I had a castle

๐Ÿฐ Before academiaโ€ฆ

And I wrote a cookbook

🎓 In academia

  • Bachelor’s in Management and Social Science
    • One course in social anthropology was my gateway to social science.
  • Master’s in Demography at Stockholm University
  • For those who don’t know, demography is the study of:
    • Fertility 🤰🏻
    • Mortality ☠
    • Migration 🚌
  • It was studying demography that made me fall in love with quantitative methods!
  • Now: PhD student in Analytical Sociology (final year) 😱

โค๏ธ What I research

  • ๐Ÿ’ž Love and romance
  • ๐Ÿ’˜ Matchmaking โ€” or more specifically, assortative mating
  • ๐Ÿ” Patterns of partner choice
    • to understand social boundaries (โ€œusโ€ and โ€œthemโ€ groups)
  • ๐Ÿ“Š Register data โ€” large-scale population data covering the whole of Sweden

โค๏ธ What I research

👨‍👩‍👧‍👦 Life outside academia

I juggle PhD life with three kids. It’s a balance of chaos, fun, and perspective.

⚓️ What I do for fun

I sail, bake sourdough bread, climb, ski, play games, and play music.

🫵 Who are you?

🗓️ Course Structure

  • Lectures + Labs: Usually a full-day session on Thursdays (except weeks 5 and 8, when it is on Tuesday)

    • 🧠 Morning: Lecture (theory, concepts, examples)
    • 🧪 Afternoon: Lab (hands-on, applied tasks, time for questions)
    • 📝 Each lab = assignment (Pass/Fail)
  • 📆 Rest of the week: Work on the lab report, study the material, strive to understand!

  • AI log: One entry per week

  • Final exam: Sit-in exam, graded F–A

📖 How to learn in this class?

  1. 📚 Read the mandatory literature – It prepares you for the topics covered in lectures and labs.

  2. 🎓 Attend lectures and labs – They guide what’s examined. If you skip them, make sure you master all assignments.

  3. 💻 Work through the labs carefully – They’re designed to teach you everything needed for the final exam.

  4. 🤖 Discuss with LLMs – Use them to deepen understanding and adapt concepts to your own learning style.

  5. 💬 Come to teaching hours – Ask questions, clarify doubts, and connect ideas.

🧮 Grading

  • ✅ Lab assignments:
    • Pass/Fail
    • You must pass all of them to pass the course
    • I will mostly give you collective, not individual, feedback
    • If you fail a lab, don’t worry: it’s part of learning, and you can resubmit
  • 🤖 AI log
    • Pass/Fail
  • 🧪 Final Exam:
    • Graded F–A

📚 Readings

  • 🔥 Mandatory readings are few and should be read before class.
  • 🌱 Optional readings offer extra depth and alternative perspectives.
  • 📚 You’re welcome to read the full course books, but at your own discretion.
  • 📖 The literature supports your learning and complements the lectures and labs.

🧑‍🏫 Teaching Hours

Got questions, stuck in code, or want to chat about regression?

🕙 Mondays, 10:00–11:00, I’ll be available on Zoom

  • https://liu-se.zoom.us/my/lindmarker

๐Ÿ—๏ธ LISAM

https://liuonline.sharepoint.com/sites/Lisam_771A17_2025HT_7X/SitePages/TrainingHome.aspx

📈 Course development and recent changes

  • The master’s programme has recently been revised.
    • Discrete Choice Modeling (DCM) has been replaced by Text Analysis.
    • This means that DCM now needs to be covered in SDS-II.
  • SDS-II has consistently been well appreciated by students
    (mean evaluation 2024: 4.52 / 5, source: Evaliuate)
  • However, student feedback highlighted that:
    • Literature seminars were not as effective as other parts of the course.
    • Labs were appreciated but could be more challenging and applied.
    • Some students felt unsure about the basic logic behind regression models.
  • Teachers in later courses also noted a need for stronger methodological foundations.

🔄 Main adjustments

  1. Greater focus on core regression ideas: understanding, interpretation, and application
  2. Slightly reduced focus on causal inference, to strengthen fundamentals first
  3. Selected Discrete Choice Modeling content integrated
  4. Seminars replaced with more applied and interactive lab-based activities
    • Stronger link between lectures and labs
    • More emphasis on coding, interpretation, and hands-on practice
    • Seminar content is now covered in Contemporary research

๐ŸŒ A Very Diverse Class

  • We all come from different backgrounds, having studied in different fields.
  • Some of you may find the content easy; some may find them hard
  • I encourage those who get it quickly to help those who donโ€™t
  • ๐Ÿ™Œ Teaching someone else is the best way to understand deeply
  • If you find the material too easy, I can point you too additional material.

🧠 What We Focus On

  • Not a lot of math, but some.
  • Interpretation
  • Application

🧠 Learn Mathematical Notation

As I said, we won’t do a lot of math.

But we will use notation. And it helps to get comfy with it.

👉 Don’t be scared off by Greek letters or funny symbols: \[ \text{logit}(p_i) = \beta_0 + \sum_{j=1}^k \beta_j x_{ij} \]
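To make the formula less abstract, here is a minimal Python sketch of what it says: multiply each coefficient by its variable, add them up together with the intercept, and you have the log-odds for one observation. The function names, coefficient values, and data are made up for illustration, and Python is only an assumption here, not necessarily the software used in the labs.

```python
import math

def linear_predictor(beta0, betas, x_row):
    """Return beta_0 + sum over j of beta_j * x_ij for one observation i."""
    return beta0 + sum(b * x for b, x in zip(betas, x_row))

def inverse_logit(eta):
    """Convert a log-odds value (the logit) back into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical numbers, chosen only to keep the arithmetic easy to follow
beta0 = -1.0            # intercept, beta_0
betas = [0.5, 2.0]      # beta_1 and beta_2, so k = 2
x_row = [2.0, 0.0]      # x_i1 and x_i2 for one observation i

eta = linear_predictor(beta0, betas, x_row)   # logit(p_i) = -1 + 0.5*2 + 2*0 = 0
p = inverse_logit(eta)                        # a logit of 0 corresponds to p_i = 0.5

print(eta, p)
```

The sum sign in the formula is nothing more than the `sum(...)` over coefficient-variable pairs.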

✍️ Some examples

\(x\) or \(y\) – a variable or value

\(\bar{x}\) – the mean (average) of \(x\)

\(n\) – number of observations

\(\sum\) – summation: e.g. \(\sum_{i=1}^n x_i\) means “add all the \(x_i\) values from \(i = 1\) to \(n\)”

\(x_i\) – value of variable \(x\) for observation (row) \(i\)

\(x_{ij}\) – value in row \(i\), column \(j\) of a data matrix

\(\widehat{y}\) – “y-hat” = estimated or predicted value

\(\alpha, \beta, \gamma, \delta, \varepsilon\)… – Greek letters we assign meaning to

\(\mathbb{E}[X]\) – expected value of \(X\)

\(f(x)\) – a function of \(x\) (a rule that maps inputs to outputs)

\(y = a + b x\) – a straight line with intercept \(a\) and slope \(b\)
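If you read code more easily than symbols, the notation above can be sketched in a few lines of Python. All the numbers and variable names are hypothetical and only there for illustration. Note that Python counts from 0 while the notation counts from 1.

```python
# x_1, x_2, x_3: three observations of a variable x
x = [2.0, 4.0, 6.0]

n = len(x)           # n: number of observations
total = sum(x)       # the sum of x_i for i = 1 to n
x_bar = total / n    # x-bar: the mean of x

# f(x): a rule that maps inputs to outputs; here the straight line y = a + b*x
a, b = 1.0, 0.5      # hypothetical intercept and slope

def f(value):
    return a + b * value

# y-hat: the value the line predicts for each x_i
y_hat = [f(x_i) for x_i in x]

# x_ij: row i, column j of a data matrix (here 2 rows, 2 columns)
X = [[1.0, 2.0],
     [3.0, 4.0]]
x_21 = X[1][0]       # row 2, column 1 in the notation; indices 1 and 0 in Python

print(n, total, x_bar, y_hat, x_21)
```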

🧩 Why This Matters

  • Notation \(=\) universal language of math and stats
  • Helps you understand books, slides, articles, even ChatGPT explanations
  • Once you learn it, you stop seeing it as scary and start seeing it as helpful

What are your questions?

See you tomorrow!