DZone

Overview of Problem

Tabular data is quite common in various business and engineering problems. Machine learning can be used to predict particular columns of the table we are interested in, using other columns as input features. We will take an example of using historical house sales data to predict sales prices for houses that come on the market in the future. The house prices dataset from Kaggle contains such data for Ames, Iowa. It contains predictive columns like house area, neighborhood area name, type of building, house style, condition, year last sold, etc., among a total of 79 such predictive features. Some of these features are categorical while others are numerical and our goal is to predict the Sale Price (a numeric column) of houses using these features.

Id
LotArea
Neighborhood
BldgType
Style
Cond
YrBuilt
1stFlrSF
2ndFlrSF
Fireplaces
YrSold
SalePrice
1
8450
CollgCr
1Fam
2Story
5
2003
856
854
0
2008
208500
2
9600
Veenker
1Fam
1Story
8
1976
1262
0
1
2007
181500
3
11250
CollgCr
1Fam
2Story
5
2001
920
866
1
2008
223500
4
9550
Crawfor
1Fam
2Story
5
1915
961
756
1
2006
140000
5
14260
NoRidge
1Fam
2Story
5
2000
1145
1053
1
2008
250000
6
14115
Mitchel
1Fam
1.5Fin
5
1993
796
566
0
2009
143000
7
10084
Somerst
1Fam
1Story
5
2004
1694
0
1
2007
307000
8
10382
NWAmes
1Fam
2Story
6
1973
1107
983
2
2009
200000
9
6120
OldTown
1Fam
1.5Fin
5
1931
1022
752
2
2008
129900
10
7420
BrkSide
2fmCon
1.5Unf
6
1939
1077
0
2
2008
118000
11
11200
Sawyer
1Fam
1Story
5
1965
1040
0
0
2008
129500
12
11924
NridgHt
1Fam
2Story
5
2005
1182
1142
2
2006
345000
13
12968
Sawyer
1Fam
1Story
6
1962
912
0
0
2008
144000

Overview of Google AutoML Tables

Google AutoML Tables enables quick and high accuracy training and subsequent hosting of ML models for such a problem. Users can import and visualize the data, train a model, evaluate it on a test set, iterate on improving model accuracy and then host the best model for online/offline predictions. All of the above functionality is available as a service without any ML expertise or hardware or software installation required from users.
AutoML table can train both regression and classification models depending on the type of column we are trying to predict.

Source: DZone