Target Audience
The course is intended for professional engineers, project managers, data analysts and computer scientists who are responsible for new product design, planning and development, or for the quality management of current products, and who wish to develop advanced knowledge and skills in applying statistical methods to support robust, data-based decision-making.
Prerequisites
- technical English suitable for regular professional use
- basic previous coding experience is required (variables, control flow, lists...)
- some workplace experience with Data Analysis / Big Data is expected, because each participant needs a use case to study
- attendance of "Introduction to Python with Google Colab" module (M00) is recommended
- attendance of "Statistics for Engineering" module (M01) is recommended
- or equivalent practical knowledge of basic statistical concepts, at an equivalent education level
Training Delivery Methodology
The course is delivered workshop-style, with an approximately 50/50 split between technical sessions and hands-on exercises that explain the concepts through relevant industrial case studies.
The course is delivered online. A course package is sent a week in advance, containing joining instructions and training material, including introductory-level Python programming tutorials.
Technical Equipment
The training is delivered as a virtual classroom, using Microsoft Teams. Login information is sent no later than 2 days before the training.
Modes of Assessment
- Attendance sheet signed each half day by the participants and co-signed by ASTE
- Learning assessment based on:
  - an individual or group presentation of a mini-project, with supporting arguments
  - an individual plan for applying the methods and tools from the course to a specific project in the participant's workplace
- Training performance: qualitative assessment of the training by the participants at the end of the session
- Delivery of a training certificate
Access Deadline
- Open training: registration no later than 7 days before the training
- In-house training: at least 4 weeks are needed to organise the session.
Accessibility to Disabled People
Learning Outcomes
Upon completion of this module, the participant will be able to:
- Take the initiative in reviewing solutions for big data statistical analysis and processing.
- Analyse available data and produce results, or guide toward appropriate applications of Big Data Statistical Analysis.
- Identify correlations and construct statistical models from Engineering Big Data Resources.
- Interpret the results and explain them to non-specialists.
- Use practical software tools, with a focus on workflow design and experimentation.
Programme
This short course gives participants hands-on development of the specialist knowledge in statistical data analysis needed to apply data science principles and to provide data-driven, innovative engineering solutions.
The course is organised as follows:
Fundamentals
- Data quality and data cleansing; Data preparation (statistical evaluation of data quality, data cleaning and data transformation);
- Exploring concepts of data cardinality, dimensionality, imbalance, similarity, feature selection;
- Engineering problem solving using Python Programming.
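As an illustration of the data quality concepts listed above, the following is a minimal sketch in plain Python. The records, column names and helper functions (`missing_rate`, `cardinality`, `imbalance_ratio`) are hypothetical and chosen for illustration only; they are not taken from the course material.

```python
from collections import Counter

# Hypothetical sensor records; None marks a missing reading.
records = [
    {"sensor": "A", "temp_c": 21.5, "status": "ok"},
    {"sensor": "B", "temp_c": None, "status": "ok"},
    {"sensor": "A", "temp_c": 22.1, "status": "fault"},
    {"sensor": "C", "temp_c": 20.9, "status": "ok"},
]

def missing_rate(rows, column):
    """Fraction of rows in which the column value is missing (None)."""
    return sum(1 for r in rows if r[column] is None) / len(rows)

def cardinality(rows, column):
    """Number of distinct non-missing values in the column."""
    return len({r[column] for r in rows if r[column] is not None})

def imbalance_ratio(rows, column):
    """Count of the most frequent class divided by the least frequent one."""
    counts = Counter(r[column] for r in rows)
    return max(counts.values()) / min(counts.values())

print(missing_rate(records, "temp_c"))     # 0.25
print(cardinality(records, "sensor"))      # 3
print(imbalance_ratio(records, "status"))  # 3.0
```

Checks of this kind are typically run before any modelling, since a high missing rate or a strong class imbalance changes which cleaning and sampling steps are appropriate.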
Data Pre-Processing
- Exploratory data visualisation and introduction to statistical concepts relevant to exploration of Big Data;
- Feature engineering (importance, selection, dimensionality reduction, PCA).
Statistical Classification
- Methods for handling high-dimensional data, large samples and sample splitting;
- Data-driven algorithms for statistical models of engineering Big Data (classifiers, decision trees, Naïve Bayes);
- Bootstrap and bagging methods.
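To give a flavour of the bootstrap idea, here is a minimal sketch of a percentile bootstrap confidence interval for a mean, using only the Python standard library. The function name `bootstrap_mean_ci`, its parameters and the measurement values are assumptions made for this example, not part of the course material.

```python
import random
import statistics

def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement, same size as the original sample,
    # and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n))
        for _ in range(n_resamples)
    )
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical measurements of a nominal 10.0 mm part.
measurements = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.0, 9.7]
lower, upper = bootstrap_mean_ci(measurements)
print(f"95% bootstrap CI for the mean: [{lower:.3f}, {upper:.3f}]")
```

Bagging builds on the same resampling step: instead of averaging a statistic, a model is fitted to each resample and the predictions are combined.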
Special Topics
- Introduction to market basket analysis and association rules;
- Grouping (k-means, high-dimensional clustering, subspace clustering);
- Text Mining.
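The association rule measures used in market basket analysis can be sketched in a few lines of plain Python. The transactions and the helper functions `support`, `confidence` and `lift` below are hypothetical and only illustrate the standard definitions.

```python
# Hypothetical transactions, each one a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for b in baskets if itemset <= b) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)

def lift(antecedent, consequent, baskets):
    """Confidence divided by the baseline support of the consequent."""
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)

rule = ({"bread"}, {"butter"})
print("support:   ", round(support(rule[0] | rule[1], transactions), 3))
print("confidence:", round(confidence(*rule, transactions), 3))
print("lift:      ", round(lift(*rule, transactions), 3))
```

A lift below 1 suggests the two items appear together less often than would be expected if they were independent; a lift above 1 suggests a positive association.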
Mini Project
- Independent practice through application to a relevant Engineering Big Data individual project.