Artificial Intelligence for Big Data

The transformer function

A transformer is an abstraction that converts one DataFrame into another. For instance, an ML model is a transformer that turns a DataFrame with features into a DataFrame with predictions. Transformers cover both feature transformers and learned models. Each one implements a transform() method, which typically appends one or more columns to the input DataFrame. The following code uses Tokenizer, a feature transformer, to split a text column into words:

import org.apache.spark.ml.feature.Tokenizer

val df = spark.createDataFrame(Seq(
  ("This is the Transformer", 1.0),
  ("Transformer is pipeline component", 0.0)
)).toDF("text", "label")

val tokenizer = new Tokenizer().setInputCol("text").setOutputCol("words")
val tokenizedDF = tokenizer.transform(df)
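
The Tokenizer above is a feature transformer. A learned model is also a transformer, and the following minimal sketch illustrates that case. It is meant to be pasted into spark-shell or a Scala worksheet. The toy training data, the application name, and the choice of LogisticRegression are assumptions made for illustration and do not come from the text. The fit() call is there only to obtain a model whose transform() method can then be applied.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Assumption: a local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("TransformerSketch") // hypothetical application name
  .getOrCreate()

// Hypothetical toy data: a label column and a feature-vector column.
val training = spark.createDataFrame(Seq(
  (1.0, Vectors.dense(0.0, 1.1, 0.1)),
  (0.0, Vectors.dense(2.0, 1.0, -1.0)),
  (0.0, Vectors.dense(2.0, 1.3, 1.0)),
  (1.0, Vectors.dense(0.0, 1.2, -0.5))
)).toDF("label", "features")

// Fitting produces a LogisticRegressionModel, which is itself a transformer.
val model = new LogisticRegression().setMaxIter(10).fit(training)

// transform() keeps the input columns and appends prediction-related columns.
val predictions = model.transform(training)
predictions.select("features", "label", "prediction").show(false)
```

As with the Tokenizer, the input DataFrame is left unchanged; transform() returns a new DataFrame that carries the original columns plus the added ones.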