PyCon Taiwan 2020 Tutorial - How to develop ML APIs with Python from Online Learning Dataset

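This tutorial is open only to attendees registered for the PyCon Taiwan 2020 conference. To attend, please first buy a ticket on the conference ticketing page, then register here.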

 

Tutorial Info 課程說明

Abstract 摘要

Python engineers now have more opportunities than ever to work with data scientists. As a result, they often face research-oriented code written by researchers or data scientists. To integrate this code into systems such as APIs, Python engineers need to write additional code or refactor the existing code, and then make it work on a server.

This tutorial lets attendees experience the whole process from analysis to API development in Python. More specifically, it covers the gap between research-oriented code and the production code of machine learning APIs. What is the gap between them? How can it be bridged with real-world Python code? How can the code validate whether a dataset is correct? How can machine learning models be inspected continuously? Attendees will find answers to these questions in this tutorial.

Goal 目標

  1. The intended audience is junior- or mid-level engineers, data scientists, and researchers who use Python.
  2. The tutorial will be most useful to people who have been involved in AI / ML projects.
  3. Attendees will learn the responsibilities of each role in AI / ML projects, the process of building AI / ML products, and good practices for implementing ML APIs.

Speaker Bio 講者介紹

Jesse is a software engineer at Classi, a leading EdTech company in Tokyo. He has been developing a recommender engine that selects appropriate learning materials according to learners' abilities. It is based on a statistical method well known in educational psychology and is implemented in Python. Before that, he researched the relationships between online learning behaviors and learning outcomes at the UCL Institute of Education (IOE) in the UK. He is interested in how to bridge the gap between data science and engineering.

Detail Description 詳細說明

This tutorial lets attendees experience the whole process from analysis to API development, using Python and an online learning dataset. In other words, attendees can see how results produced by data science are used in machine learning products. Attendees who are involved in AI / ML projects can also learn good practices for developing ML APIs with Pythonic code.

The tutorial uses an original sample ML API integrated with ML models that predict online test results based on Item Response Theory (IRT). Besides IRT and the online learning dataset, the tutorial uses Python, Flask, Locust (a Flask-based load-testing tool), GKE, and Cloud Monitoring.
 

Outline 課程大綱

Introduction (10 minutes)

  • Why am I talking about this topic? (1 minute)
  • The Three Steps to develop ML APIs with Python from Online Learning Dataset (2 minutes)
  • Explain which ML API we will develop (7 minutes): an original sample ML API integrated with ML models that predict online test results based on Item Response Theory (IRT)

Main Talk (70 minutes)

First Step: Write Research Oriented Code (30 minutes)

■ Explain the iterative process of research

  • Create features / Generate machine learning models / Evaluate the models and results

■ Create features, which means feature engineering + (demo code using online learning sample dataset)

  • Calculate the mean, count, variance, quartiles, correlation, etc.
  • Visualize the data with histograms, scatter plots, box plots, heatmaps, etc.
  • Preprocess with the four operations: delete (drop), filter, replace, and de-duplicate
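
As a rough illustration of the summary statistics and the four preprocessing operations listed above, here is a minimal sketch using only the standard library. The record layout (`user_id`, `item_id`, `correct`, `seconds`) and the cleaning rules are assumptions made for this example; they are not taken from the tutorial's dataset.

```python
import statistics

# Hypothetical sample records: one row per answered question.
records = [
    {"user_id": "u1", "item_id": "q1", "correct": 1, "seconds": 12.0},
    {"user_id": "u1", "item_id": "q2", "correct": 0, "seconds": 30.0},
    {"user_id": "u2", "item_id": "q1", "correct": 1, "seconds": 8.0},
    {"user_id": "u2", "item_id": "q1", "correct": 1, "seconds": 8.0},      # duplicate row
    {"user_id": "u3", "item_id": "q2", "correct": None, "seconds": 15.0},  # missing label
    {"user_id": "u3", "item_id": "q1", "correct": 0, "seconds": -1.0},     # invalid time
]


def preprocess(rows):
    """Apply delete (drop), filter, replace, and de-duplicate, in that order."""
    # Delete (drop): remove rows whose label is missing.
    rows = [r for r in rows if r["correct"] is not None]
    # Filter: keep only rows with a non-negative response time.
    rows = [r for r in rows if r["seconds"] >= 0]
    # Replace: normalise the 0/1 label to a bool.
    rows = [{**r, "correct": bool(r["correct"])} for r in rows]
    # De-duplicate: keep the first occurrence of each identical row.
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


clean = preprocess(records)
seconds = [r["seconds"] for r in clean]

# Simple descriptive statistics over the cleaned response times.
summary = {
    "count": len(seconds),
    "mean": statistics.mean(seconds),
    "variance": statistics.variance(seconds),  # sample variance
    "quartiles": statistics.quantiles(seconds, n=4),
}
```

In exploratory work the same steps are often written with pandas; the plain-Python form is shown here only to keep the sketch dependency-free.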

■ Generate machine learning models + (demo code using online learning sample dataset)

  • Based on your understanding of the data, select appropriate algorithms (explains how to pick appropriate algorithms)
  • Implement the algorithms with Python and Python libraries
  • Input features into the appropriate algorithms 
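
To make the modelling step concrete, the sketch below shows one common IRT formulation, the two-parameter logistic (2PL) model, together with a crude grid-search ability estimate. The page does not say which IRT variant the tutorial uses, so the 2PL choice, the function names, and the item parameters are all assumptions.

```python
import math


def p_correct(theta, a, b):
    """2PL IRT: probability that a learner with ability `theta` answers
    correctly an item with discrimination `a` and difficulty `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses, items, grid=None):
    """Grid-search maximum-likelihood estimate of theta.

    responses: list of 0/1 answers; items: list of (a, b) tuples.
    """
    if grid is None:
        grid = [x / 10 for x in range(-40, 41)]  # candidate thetas from -4.0 to 4.0

    def log_likelihood(theta):
        total = 0.0
        for y, (a, b) in zip(responses, items):
            p = p_correct(theta, a, b)
            total += math.log(p) if y else math.log(1.0 - p)
        return total

    return max(grid, key=log_likelihood)


# Hypothetical item parameters (discrimination, difficulty).
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
strong = estimate_ability([1, 1, 0], items)
weak = estimate_ability([1, 0, 0], items)
```

A learner who answers more items correctly receives a higher estimated ability, which is the quantity a recommender could then use to pick suitable materials.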

■ Evaluate machine learning models and their results + (demo code using online learning sample dataset)

  • Based on the results that the algorithms calculated, select appropriate evaluation methods (shows a list of evaluation methods used in the pedagogical domain and explains how to choose an appropriate method based on academic criteria)
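
The page does not list the evaluation methods covered. As one possible illustration for a model that predicts correct/incorrect answers, the sketch below computes accuracy and log loss; both metrics and the function names are assumptions chosen for the example.

```python
import math


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of predictions on the correct side of the threshold."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)


def log_loss(y_true, p_pred, eps=1e-12):
    """Mean negative log-likelihood of the observed 0/1 answers."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total / len(y_true)
```

Accuracy only looks at which side of the threshold a prediction falls on, while log loss also penalises over-confident probabilities, so the two can rank models differently.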

Second Step: Transform Research Oriented Code into ML APIs (20 minutes)

■ Understand the code you read and write, and how to handle it

  • What is Research Oriented Code?
  • What are ML APIs?
  • How should engineers handle research oriented code?

■ Modularize research oriented code + (demo code using sample python code)

  • Categorize research oriented code into preparation code, preprocessing code, and ML code
  • Break them out into functions and make them testable
  • Clarify the inputs and outputs of the code, and define the URI
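
A minimal sketch of the modularization described above: the code is split into preparation, preprocessing, and ML functions, each testable on its own, with one handler that fixes the input and output shape. The function names, the payload layout, and the `POST /predict` URI mentioned in the docstring are hypothetical.

```python
import math


def prepare(raw_items):
    """Preparation code: parse raw item parameters into floats."""
    return [{"a": float(r["a"]), "b": float(r["b"])} for r in raw_items]


def preprocess(items):
    """Preprocessing code: drop items with non-positive discrimination."""
    return [it for it in items if it["a"] > 0]


def predict(theta, items):
    """ML code: probability of a correct answer per item (2PL IRT, assumed)."""
    return [1.0 / (1.0 + math.exp(-it["a"] * (theta - it["b"]))) for it in items]


def handle_predict(payload):
    """Handler behind a hypothetical `POST /predict` URI.

    Input:  {"theta": float, "items": [{"a": ..., "b": ...}, ...]}
    Output: {"probabilities": [float, ...]}
    """
    items = preprocess(prepare(payload["items"]))
    return {"probabilities": predict(float(payload["theta"]), items)}
```

Because `handle_predict` takes and returns plain dictionaries, a web framework route (for example in Flask) could wrap it without touching the three inner functions.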

■ Refactor research oriented code + (demo code using sample python code)

  • Prepare for refactoring by understanding the requirements of each piece of code and taking notes on them
  • Simplify the I/O code in the preparation code
  • Convert pandas-style code in the preprocessing code into pure Python
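
As an example of the last point, the sketch below rewrites a typical pandas aggregation as plain Python. The column names and the function name are assumptions; the pandas expression appears only in the docstring for comparison.

```python
from collections import defaultdict


def correct_rate_by_user(rows):
    """Pure-Python counterpart of the pandas expression
    df.groupby("user_id")["correct"].mean().
    """
    totals = defaultdict(lambda: [0, 0])  # user_id -> [sum of correct, row count]
    for row in rows:
        acc = totals[row["user_id"]]
        acc[0] += row["correct"]
        acc[1] += 1
    return {user: s / n for user, (s, n) in totals.items()}


rates = correct_rate_by_user([
    {"user_id": "u1", "correct": 1},
    {"user_id": "u1", "correct": 0},
    {"user_id": "u2", "correct": 1},
])
```

One reason to make this kind of change in an API is that a request usually carries a handful of rows, where building a DataFrame adds overhead and a heavy dependency for little benefit.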

Third Step: Deploy ML APIs on ML pipelines (20 minutes)

■ Check that the ML APIs work correctly on the server

  • Write decorators to automatically check parameters
  • Present a small example of a CI/CD environment for ML APIs
  • Set up production-like environments with GKE
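
The decorator-based parameter check mentioned in the list above could look roughly like the following. This is a sketch under assumptions: the decorator name, the keyword-only calling convention, and the `predict` function are invented for illustration and are not the tutorial's code.

```python
import functools


def validate_params(**expected_types):
    """Decorator factory: check that each named keyword argument is
    present and has the expected type before calling the function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            for name, expected in expected_types.items():
                if name not in kwargs:
                    raise ValueError(f"missing parameter: {name}")
                if not isinstance(kwargs[name], expected):
                    raise TypeError(
                        f"{name} must be {expected.__name__}, "
                        f"got {type(kwargs[name]).__name__}"
                    )
            return func(**kwargs)
        return wrapper
    return decorator


@validate_params(theta=float, item_ids=list)
def predict(theta, item_ids):
    # Hypothetical endpoint body; a real one would call the model.
    return {"theta": theta, "n_items": len(item_ids)}
```

Calling `predict(theta="high", item_ids=[])` raises `TypeError` before the model code runs, so an API layer can turn the exception into a client-error response.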

■ Check system performance

  • Run load tests with Locust and implement system logging
  • Check system performance using Cloud Monitoring
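
One way to implement the system log is a decorator that records the latency and outcome of each call. The sketch below uses only the standard `logging` module; the logger name and the log line format are assumptions, and the Locust load test and Cloud Monitoring setup are not shown.

```python
import functools
import logging
import time

logger = logging.getLogger("ml_api.system")  # hypothetical logger name


def log_latency(func):
    """Log how long each call takes and whether it succeeded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "ok"
        try:
            return func(*args, **kwargs)
        except Exception:
            status = "error"
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info(
                "endpoint=%s status=%s latency_ms=%.2f",
                func.__name__, status, elapsed_ms,
            )
    return wrapper
```

While a load test is running, lines like these give a per-request view of latency and error rate that can be compared with the figures the load-testing tool reports.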

■ Check model parameters

  • Run simulation / scenario tests with Locust and implement analysis logging
  • Check model parameters using Cloud Monitoring
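
An analysis log differs from a system log in that it records what the model estimated rather than how the server behaved. The following sketch writes one JSON line per prediction; the field names, the logger name, and the choice of JSON are assumptions, not the tutorial's format.

```python
import json
import logging

analysis_logger = logging.getLogger("ml_api.analysis")  # hypothetical logger name


def analysis_record(user_id, theta, probabilities):
    """Build one JSON line describing the model's view of a request."""
    return json.dumps(
        {
            "log_type": "analysis",
            "user_id": user_id,
            "theta": round(theta, 4),
            "mean_p_correct": round(sum(probabilities) / len(probabilities), 4),
        },
        sort_keys=True,
    )


def log_analysis(user_id, theta, probabilities):
    """Emit the record through the analysis logger and return it."""
    line = analysis_record(user_id, theta, probabilities)
    analysis_logger.info(line)
    return line
```

Keeping model quantities such as the estimated ability in a machine-readable log makes it possible to chart them over time and notice when they drift during a scenario test.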

Summary (5 minutes)

  • Quickly review the Three Steps to develop ML APIs with Python from Online Learning Dataset (2 minutes)
  • Share useful resources and tips related to the topics covered in the talk (3 minutes)

Requirement 要求

  • Attendees should bring their own laptop.

Receipt Policy 發票處理說明

  • A payment receipt can be provided; an electronic receipt will be sent to your email address.
  • Please provide your Company Name and Unified Business Number.
Venue: 台南好想工作室 / L2A, No. 16, Sec. 2, Beimen Rd., East Dist., Tainan City

Tickets

Ticket type: Regular
Sale period: 2020/08/13 00:00 (+0800) ~ 2020/09/06 00:00 (+0800) (sales ended)
Price: Free