Machine Learning with Swift

Exporting the model for iOS

In our Jupyter notebook, execute the following code to export the model:

In []:
import coremltools as coreml

# Convert the fitted scikit-learn tree; 'label' names the predicted-class output.
coreml_model = coreml.converters.sklearn.convert(tree_model, feature_names, 'label')

# Metadata shown in Xcode when the model is inspected.
coreml_model.author = "Author name goes here..."
coreml_model.license = "License type goes here ..."
coreml_model.short_description = "Decision tree classifier for extraterrestrials."

# Description keys must match an input or output feature name of the converted model.
coreml_model.input_description['data'] = "Extraterrestrials features"
coreml_model.output_description['prob'] = "Probability of belonging to class."

coreml_model.save('DecisionTree.mlmodel')

The code creates the DecisionTree.mlmodel file next to the Jupyter notebook file. This file can contain a single model, a model pipeline (several models chained one after another), or a list of scikit-learn models. According to the documentation, the scikit-learn converter supports the following types of machine learning models:

  • Decision tree learning
  • Tree ensembles
  • Random forests
  • Gradient boosting
  • Linear and logistic regression (see Chapter 5, Association Rule Learning)
  • Support vector machines (several types)

It also supports the following data transformations:

  • Normalizer
  • Imputer
  • Standard scaler
  • DictVectorizer
  • One-hot encoder

Note that you can embed one-hot encoding as part of the pipeline, so you don't need to implement it yourself in your Swift code. This is handy, because you don't need to keep track of the proper order of the categorical variable levels.
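To see why the order of levels matters, consider the following minimal sketch in plain Python. It is not part of the export code: the level names and the helper functions one_hot and decode are hypothetical, introduced only for illustration. It shows how an encoder re-implemented by hand with a different level order produces a vector that the model would interpret as a different category.

```python
# Hypothetical levels of a categorical feature, in the order fixed at training time.
TRAINING_LEVELS = ["blue", "green", "red"]


def one_hot(value, levels):
    """Return a one-hot vector for `value`, using `levels` as the column order."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if level == value else 0 for level in levels]


def decode(vector, levels):
    """Return the level that a one-hot vector stands for under `levels`."""
    return levels[vector.index(1)]


# Encoding with the training order yields the vector the model expects.
correct = one_hot("red", TRAINING_LEVELS)

# A hand-written client encoder that lists the levels in a different order
# produces a valid-looking vector that the model reads as another level.
CLIENT_LEVELS = ["red", "green", "blue"]
mismatched = one_hot("red", CLIENT_LEVELS)

print(decode(correct, TRAINING_LEVELS))
print(decode(mismatched, TRAINING_LEVELS))
```

When the encoder travels inside the exported pipeline, the level order is stored with the model, so this kind of silent mismatch cannot arise on the client side.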

The .mlmodel file can be one of three types: classifier, regressor, or transformer, depending on the last model in the list or pipeline. It is important to understand that there is no direct correspondence between scikit-learn models (or models from any other source framework) and the Core ML models that run on a device. Because Core ML is closed source, we don't know how it operates under the hood, and can't be sure that the model will produce identical results before and after the conversion. This means you need to validate the model after deploying it to a device, to measure its performance and accuracy.
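One possible way to approach such a validation is to run the same test samples through both models and compare the outputs. The sketch below is only an illustration in plain Python: the function compare_predictions, its tolerance parameter, and the sample probabilities are all hypothetical placeholders. In practice, the reference values would presumably come from the scikit-learn model (for example, from predict_proba), and the converted values from predictions collected on the device.

```python
def compare_predictions(reference, converted, tolerance=1e-4):
    """Compare per-sample class probabilities produced by two models.

    Both arguments are equal-length lists of dicts mapping a class label
    to its probability. Returns a tuple of
    (fraction of samples with the same top label,
     largest absolute probability difference,
     whether that difference is within `tolerance`).
    """
    if len(reference) != len(converted):
        raise ValueError("prediction lists must have the same length")
    if not reference:
        raise ValueError("need at least one sample")

    agree = 0
    max_diff = 0.0
    for ref, conv in zip(reference, converted):
        # Count samples where both models pick the same most likely class.
        if max(ref, key=ref.get) == max(conv, key=conv.get):
            agree += 1
        # Track the largest per-class probability gap seen so far.
        for label in set(ref) | set(conv):
            diff = abs(ref.get(label, 0.0) - conv.get(label, 0.0))
            max_diff = max(max_diff, diff)

    return agree / len(reference), max_diff, max_diff <= tolerance


# Made-up placeholder probabilities for three samples and two classes.
reference_probs = [{"a": 0.9, "b": 0.1}, {"a": 0.2, "b": 0.8}, {"a": 0.6, "b": 0.4}]
converted_probs = [{"a": 0.9, "b": 0.1}, {"a": 0.25, "b": 0.75}, {"a": 0.4, "b": 0.6}]

agreement, max_diff, ok = compare_predictions(reference_probs, converted_probs)
print(f"label agreement: {agreement:.2f}, max difference: {max_diff:.2f}, ok: {ok}")
```

Reporting label agreement and the largest probability gap separately is a deliberate choice: two models can agree on every label while still differing noticeably in their probabilities, and the reverse can happen near a decision boundary.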