Feature Engineering and Data Preparation for Analytics

Course code: DMDP42

This course introduces programming techniques for engineering meaningful input features that improve predictive modeling performance. It also provides strategies for spotting and avoiding common pitfalls that compromise the integrity of the data used to build a predictive model. The course relies heavily on SAS programming techniques.

The self-study e-learning includes:

  • Annotatable course notes in PDF format.
  • Virtual Lab time to practice.
Price: 1 080 EUR (1 307 EUR including VAT)

Do you have a question?
+420 731 175 867 edu@edutrainings.cz


Course dates

Starting date: Upon request

Type: E-learning

Course duration: 21 hours

Language: en

Price without VAT: 1 080 EUR


Starting date: Upon request

Type: Upon request

Course duration: 21 hours

Language: en

Price without VAT: 1 800 EUR




Target group

Analysts, data scientists, and IT professionals who want to craft better inputs to improve predictive modeling performance.

Course structure

Extracting Relevant Data

  • Data difficulties.
  • Assessing available data.
  • Accessing available data.
  • Drawing a representative target sample.
  • Drawing an uncontaminated input sample.

Transforming Transaction and Event Data

  • Advantages and disadvantages of transaction data.
  • Common transaction structures.
  • Defining the time horizon.
  • Fixed and variable time horizon methods.
  • Implementing common transaction transformations.

Using Nonnumeric Data

  • Definitions and difficulties of nonnumeric data.
  • Miscoding and multicoding detection.
  • Controlling degrees of freedom.
  • Geocoding.

Managing Data Pathologies

  • Exploring input variable distributions.
  • Detecting data anomalies.
  • Creating custom exploratory tools for candidate input variables.
  • Missing value imputation.
  • Data partitioning.


Prerequisites

This course assumes some experience in both predictive modeling and SAS programming. Before attending this course, you should have:
  • Exposure to DATA step programming equivalent to the SAS Programming 1: Essentials course.
  • Exposure to programming in SQL or the SQL procedure.
  • Exposure to querying data in PROC SQL and building and deploying a predictive model.
  • Familiarity with the analytical process of building predictive models and scoring new data.