Year
2022
Credit points
10
Campus offering
No unit offerings are currently available for this unit
Prerequisites
Nil
Incompatible
ITEC102 Python Fundamentals For Data Science
Unit rationale, description and aim
Data is often described as the world’s ‘new oil’, while data science is an interdisciplinary field that employs scientific methods, algorithms, tools and systems to uncover insights, knowledge and value from the massive data generated in different domains. Python, a general-purpose programming language, has gradually become the ‘engine’ of data science. In particular, many data scientists use Python because it provides a wealth of data science tools and libraries.
This unit covers the fundamental elements of the Python programming language and its use in the context of data science. This includes Python language basics, data structures, functions, files, tools, and various Python data science libraries for data processing, analysis and visualisation. Data ethics and elementary statistics and probability in data science will also be introduced. The aim of the unit is for students to learn how Python can be used to build data science solutions.
Learning outcomes
To successfully complete this unit, you will need to demonstrate that you have achieved the learning outcomes (LO) listed below.
Each outcome is informed by a number of graduate capabilities (GC) to ensure your work in this, and every unit, is part of a larger goal of graduating from ACU with the attributes of insight, empathy, imagination and impact.
On successful completion of this unit, students should be able to:
LO1 - Demonstrate an understanding of the fundamentals of the Python programming language and of data science concepts and tools (GA5, GA8)
LO2 - Demonstrate the use of common Python data science libraries and tools for data collection, cleaning, and wrangling (GA5, GA10)
LO3 - Experiment with data processing, analysis and visualisation techniques and tools to solve real-world data science problems (GA4, GA5)
LO4 - Evaluate data science ethical issues as they impact on human dignity and privacy (GA3, GA5)
Graduate attributes
GA3 - apply ethical perspectives in informed decision making
GA4 - think critically and reflectively
GA5 - demonstrate values, knowledge, skills and attitudes appropriate to the discipline and/or profession
GA8 - locate, organise, analyse, synthesise and evaluate information
GA10 - utilise information and communication and other relevant technologies effectively.
Content
Topics will include:
- Data science and Python introduction
- Data science environment setup: Google Colab, IPython, Jupyter notebooks and IDEs
- Python language syntax, semantics and scalar types
- Python language control flow and basic data structures and sequences
- Python language functions and files
- NumPy and Pandas
- Data processing on data loading, storage, and file formats
- Data processing on data cleaning and preparation
- Data processing on data wrangling: join, combine, and reshape
- Data processing on data aggregation and group operations
- Data plotting, visualisation and exploratory data analysis
- Data ethics and potential adverse impacts.
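As a rough illustration of how the loading, cleaning and aggregation topics above fit together, the following is a minimal sketch using only the Python standard library. It is not taken from the unit materials: the sample data, column names and function names are all hypothetical, and in the unit itself these steps would more likely be carried out with Pandas.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data, invented purely for illustration.
RAW_CSV = """city,temp_c
Sydney,22.5
Melbourne,
Sydney,24.1
Melbourne,18.3
Brisbane,n/a
"""


def load_rows(text):
    """Data loading: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Data cleaning: keep only rows whose temperature parses as a number."""
    cleaned = []
    for row in rows:
        try:
            temp = float(row["temp_c"])
        except ValueError:
            continue  # skip missing or malformed readings
        cleaned.append({"city": row["city"].strip(), "temp_c": temp})
    return cleaned


def mean_by_city(rows):
    """Aggregation: group readings by city and average each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["city"]].append(row["temp_c"])
    return {city: sum(vals) / len(vals) for city, vals in groups.items()}


rows = clean_rows(load_rows(RAW_CSV))
averages = mean_by_city(rows)
print(averages)
```

Rows with a missing or non-numeric reading are dropped here for simplicity; whether to drop, impute or flag such values is a judgement call that depends on the dataset.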
Learning and teaching strategy and rationale
Mode of delivery: This unit is offered mainly in ‘Attendance mode’ with aspects of ‘Multi-mode’ incorporated into the delivery to maximise the learning support offered to students. Students will be required to attend face-to-face workshops in specific physical location/s (including supervised lab practical sessions) and have face-to-face interactions with teaching staff to further their achievement of the learning outcomes. This unit is also structured with some required upfront preparation before workshops – learning materials and tasks set via online learning platforms. This will provide multiple forms of preparatory and practice opportunities for students to prepare and revise.
Further to this, to ensure students are ready to transition from the Diploma and articulate into the second year of undergraduate study, transition pedagogies will be incorporated into the unit as the key point of differentiation from the standard unit. This focuses on an active and engaging approach to learning and teaching practices, and a scaffolded approach to the delivery of curriculum to enhance student learning in a supportive environment. This will ensure that students develop foundation level discipline-based knowledge, skills and attributes, and simultaneously the academic competencies required of students to succeed in this unit.
Students should anticipate undertaking 150 hours of study for this unit, including class and lab attendance, readings, online forum participation and assessment preparation.
Assessment strategy and rationale
A range of assessment procedures will be used to meet the unit learning outcomes and develop graduate attributes consistent with University assessment requirements. The first assessment consists of small to medium sized Python setup and programming tasks. Its purpose is to assess students’ fundamental Python programming and data science skills for problem solving. The second assessment consists of data preparation tasks using key Python data science libraries. Its purpose is to assess students’ use of the Python data science libraries NumPy and Pandas, and other related tools, for collecting, cleaning and wrangling various types of data. The final assessment is a more comprehensive assignment involving data processing, analysis and visualisation. Its purpose is to assess students’ Python programming and data science techniques, from data processing to data visualisation, on real-world datasets with consideration of data ethics. Fortnightly lab sessions are associated with the assessments and include assessable lab participation/engagement.
Strategies aligned with transition pedagogies will be utilised to facilitate successful completion of the unit assessment tasks. Each assessment will incorporate developmentally staged tasks with a focus on a progressive approach to learning. This will be achieved through activities including regular feedback, particularly early in the unit, to support students’ learning; strategies to develop and understand discipline-specific concepts and terminology; in-class practice tasks with integrated feedback; and greater peer-to-peer collaboration.
The assessments for this unit are designed to demonstrate the achievement of each learning outcome. To pass this unit, students are required to obtain an overall mark of at least 50%.
Overview of assessments
Brief Description of Kind and Purpose of Assessment Tasks | Weighting | Learning Outcomes | Graduate Attributes |
---|---|---|---|
Assessment 1: Programming tasks. The first assessment item consists of Python environment setup tasks and simple Python programming and data science problems. The assessment requires students to demonstrate their understanding and use of fundamental Python programming and data science skills. Submission Type: Individual. Assessment Method: Content knowledge coding tasks. Artefact: Code. | 30% | LO1 | GA5, GA8 |
Assessment 2: Data preparation lab practical with NumPy and Pandas. The second assessment item is a data preparation practical using key Python data science libraries. The assessment requires students to use the libraries NumPy and Pandas and other related tools for collecting, cleaning and wrangling various types of data. Submission Type: Individual. Assessment Method: Conceptual knowledge coding tasks. Artefact: Code. | 30% | LO2 | GA5, GA10 |
Assessment 3: Data processing, analysis and visualisation assignment. The final assessment is a more comprehensive assignment involving data processing, analysis and visualisation. The assignment requires students to demonstrate Python data science techniques, from data processing to data visualisation, on real-world datasets with consideration of data ethics. Submission Type: Individual. Assessment Method: Applying knowledge coding tasks. Artefact: Code. | 40% | LO3, LO4 | GA3, GA4, GA5 |
Representative texts and references
Bruce, P, Bruce, A & Gedeck, P 2020, Practical Statistics for Data Scientists: 50+ Essential Concepts Using R and Python, 2nd edn, O'Reilly Media, Inc, USA.
Downey, AB 2015, Think Stats, 2nd edn, O'Reilly Media, Inc, USA.
Grus, J 2019, Data Science from Scratch, 2nd edn, O'Reilly Media, Inc, USA.
McKinney, W 2018, Python for Data Analysis, 2nd edn, O'Reilly Media, Inc, USA.
Massaron, L & Mueller, JP 2019, Python for Data Science: For Dummies, 2nd edn, John Wiley & Sons, Hoboken.
Matthes, E 2019, Python Crash Course: A Hands-On, Project-Based Introduction to Programming, 2nd edn, No Starch Press.