Do Video Game Descriptions Help Determine a Game's Overall Rating?

by: J.Roberge

How do descriptions effect a video game's performance. Do high ranking video games have a more detailed description and most importantly, can this description be used to predict a video games performance? This analysis will utilize a previously curated dataset that has tokenized the description of 10,000 video games which were found on itunes. The issues going forward will be how to most effectively deal with an extremely sparse matrix and which type of model performs best on this type of dataset. This analysis will be broken down into three sections: section One, will deal with outlier detection; section two, will reduce the dimensions; and section 3 will fit the models.

Please note, that the way I fit these models is not 'correct' (you shouldn't touch your y_test until the very end), but method used was only employed do to the constraints of the assignment (which can be found here)

Table of Contents

In [1]:
### importing dependencies ####
import pandas as pd
import numpy as np

### dimension reduction teckniques
from sklearn.decomposition import PCA
from sklearn.decomposition import SparsePCA
from sklearn.manifold import TSNE


### Validation techniques/ search techniques
from sklearn.metrics import f1_score
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.model_selection import ParameterGrid


## models 
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import GradientBoostingClassifier
from xgboost import XGBClassifier

### outlier
from sklearn.preprocessing import StandardScaler
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

### plotting
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import seaborn as sns

### feature selection
from xgboost import plot_importance
from sklearn.feature_selection import SelectKBest, chi2, f_regression
In [2]:
### reading in Data Frame
%cd "C:\Users\jwr17\OneDrive - University of New Hampshire\machine learning\quiz_2\Quiz 2 ML"
### importing data sets ###
df_token=pd.read_csv("Apple1000games.csv", index_col=0)

df_des=pd.read_csv("apple1000new.csv")
C:\Users\jwr17\OneDrive - University of New Hampshire\machine learning\quiz_2\Quiz 2 ML
In [3]:
### mapping to categorical 

# df_des
cuts=pd.cut(df_des['Average User Rating'], [-1, 2.99, 3.99, 5.5], labels=['Poor', 'Good', "Great"])

# adding cuts to the description csv
df_des['User_cat']=cuts
In [5]:
### checking for missing values
df_des.User_cat.isna().sum()

### im going to impute missing values to Poor do to the size of the
### data set (loosing 100 rows is significant and i believe it is safe to assume that games without a ratting are more than lickley poor)
df_des['User_cat']=df_des.User_cat.fillna('Poor')
print("Shape fo description file", df_des.shape)
display(df_des.head(3))

print("\nShape fo token file", df_token.shape)
display(df_token.head())
Shape fo description file (1000, 29)
Name Icon URL Average User Rating User Rating Count Price In-app Purchases Description Developer Age Rating Languages ... Unnamed: 19 Unnamed: 20 Unnamed: 21 Unnamed: 22 Unnamed: 23 Unnamed: 24 Unnamed: 25 Unnamed: 26 Unnamed: 27 User_cat
0 Sudoku https://is2-ssl.mzstatic.com/image/thumb/Purpl... 4.0 3553.0 2.99 NaN Join over 21,000,000 of our fans and download ... Mighty Mighty Good Games 4+ DA, NL, EN, FI, FR, DE, IT, JA, KO, NB, PL, PT... ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Great
1 Reversi https://is4-ssl.mzstatic.com/image/thumb/Purpl... 3.5 284.0 1.99 NaN The classic game of Reversi, also known as Oth... Kiss The Machine 4+ EN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Good
2 Morocco https://is5-ssl.mzstatic.com/image/thumb/Purpl... 3.0 8376.0 0.00 NaN Play the classic strategy game Othello (also k... Bayou Games 4+ EN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Good

3 rows × 29 columns

Shape fo token file (1000, 2000)
abil abl abov absolut access accomplish accord account accur ace ... you\u2019r you\u2019v young youtub z zero zingoplaylit zombi zone zoom
0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
2 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
4 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0

5 rows × 2000 columns

Section One: Outliers Detection

I'm currently at a crossroads with outlier detection. This is a sparse matrixs and I worry that a cell with any sort of value in it may be considered an outlier. In order to ensure I'm not just getting rid of values good data I will sort thorough what is considered an outlier.

In [ ]:
#### outlier Detection ####

# I am currently at a corssroads with outlier detection. Im a little worried about the data being sparse and trying to run outlier
# analysis on it. 

### standardization ####
standard=StandardScaler().fit_transform(df_token)
df_token=pd.DataFrame(standard, columns=df_token.columns)

### mahalanobis ###
clf = EllipticEnvelope(contamination=.05,random_state=0)
clf.fit(df_token)
Envelope_prediction= clf.predict(df_token)
envelope_scroes= pd.Series(clf.decision_function(df_token)) 

### isolation forest ###
clf = IsolationForest( n_estimators=400, random_state=4, n_jobs=-1, contamination=.05)
clf.fit(df_token)
outliers=clf.predict(df_token)
TagRemovePreprocessor.remove_all_outputs_tags

After reviewing the outliers it seems that the mahalanobis and isolation forest performed well. From here I will drop everything that it considered to be an outlier.

In [8]:
#### dropping outliers ###
outliers_master=pd.concat([pd.Series(Envelope_prediction), pd.Series(outliers)], axis=1)
outliers_master.columns=['mahalanobis', 'isolation']
outliers_master[outliers_master.isolation<1]
print("The shape of the outlier dataframe: ",outliers_master.shape)
display(outliers_master.head(3))

### Droping outliers
df_token=df_token[(outliers_master.isolation==1) | (outliers_master.mahalanobis ==1)]
df_des=df_des[(outliers_master.isolation==1) | (outliers_master.mahalanobis ==1)]

print("The shape of the token dataframe after outlier drop:", df_token.shape)
display(df_token.head(3))

print("The shape of the description dataframe  after outlier drop:", df_des.shape)
display(df_des.head(3))
The shape of the outlier dataframe:  (1000, 2)
mahalanobis isolation
0 1 1
1 1 1
2 1 1
The shape of the token dataframe after outlier drop: (991, 2000)
abil abl abov absolut access accomplish accord account accur ace ... you\u2019r you\u2019v young youtub z zero zingoplaylit zombi zone zoom
0 -0.233203 -0.251087 -0.152507 -0.162088 -0.220433 -0.110208 -0.112333 -0.171306 -0.105463 -0.085064 ... -0.20989 -0.143463 -0.095298 -0.153432 -0.054313 -0.096659 -0.031639 -0.118664 -0.067919 -0.168645
1 -0.233203 -0.251087 -0.152507 -0.162088 -0.220433 -0.110208 -0.112333 -0.171306 -0.105463 -0.085064 ... -0.20989 -0.143463 -0.095298 -0.153432 -0.054313 -0.096659 -0.031639 -0.118664 -0.067919 -0.168645
2 -0.233203 -0.251087 -0.152507 -0.162088 -0.220433 -0.110208 -0.112333 -0.171306 -0.105463 -0.085064 ... -0.20989 -0.143463 -0.095298 -0.153432 -0.054313 -0.096659 -0.031639 -0.118664 -0.067919 -0.168645

3 rows × 2000 columns

The shape of the description dataframe  after outlier drop: (991, 29)
Name Icon URL Average User Rating User Rating Count Price In-app Purchases Description Developer Age Rating Languages ... Unnamed: 19 Unnamed: 20 Unnamed: 21 Unnamed: 22 Unnamed: 23 Unnamed: 24 Unnamed: 25 Unnamed: 26 Unnamed: 27 User_cat
0 Sudoku https://is2-ssl.mzstatic.com/image/thumb/Purpl... 4.0 3553.0 2.99 NaN Join over 21,000,000 of our fans and download ... Mighty Mighty Good Games 4+ DA, NL, EN, FI, FR, DE, IT, JA, KO, NB, PL, PT... ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Great
1 Reversi https://is4-ssl.mzstatic.com/image/thumb/Purpl... 3.5 284.0 1.99 NaN The classic game of Reversi, also known as Oth... Kiss The Machine 4+ EN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Good
2 Morocco https://is5-ssl.mzstatic.com/image/thumb/Purpl... 3.0 8376.0 0.00 NaN Play the classic strategy game Othello (also k... Bayou Games 4+ EN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN Good

3 rows × 29 columns

Section Two: Dimension Reduction

Currently there is around 10,00 features in this dataset, and these features are extremely sparse. In order to get a working model, dimension reduction must take place. I will employ three dimension reduction techniques: first, I will use Principle Component Analysis and I will keep 90% of the variation; Second, I will employ Sparse PCA and keep the top 20 Components; and third I will use T_SNE and keep three components.

In [9]:
## pca's
pca=PCA(0.9).fit_transform(df_token)

master = pd.DataFrame(pca, columns=['PC_'+str(i) for i in range(1,pca.shape[1]+1)])

## sparse pca
sparse=SparsePCA(n_components=20, n_jobs=-1).fit_transform(df_token)
sparse= pd.DataFrame(sparse, columns=['Sparse_'+str(i) for i in range(1,21)])
master=pd.concat([sparse,master], axis=1)
del(sparse)
del(pca)

print("Shape of the reduced dimension df with sparse and pca", master.shape)
display(master.head(3))
C:\Users\jwr17\Anaconda3\envs\PyhtonAndR\lib\site-packages\sklearn\decomposition\sparse_pca.py:170: DeprecationWarning: normalize_components=False is a backward-compatible setting that implements a non-standard definition of sparse PCA. This compatibility mode will be removed in 0.22.
  DeprecationWarning)
Shape of the reduced dimension df with sparse and pca (991, 458)
Sparse_1 Sparse_2 Sparse_3 Sparse_4 Sparse_5 Sparse_6 Sparse_7 Sparse_8 Sparse_9 Sparse_10 ... PC_429 PC_430 PC_431 PC_432 PC_433 PC_434 PC_435 PC_436 PC_437 PC_438
0 0.003111 -0.007090 -0.001605 0.004124 -0.003503 0.001851 -0.003369 0.001227 -0.000279 0.031796 ... -0.020590 0.021111 0.321117 1.142703 -0.307474 0.153247 0.007989 -0.026510 -0.548945 -0.393779
1 0.003174 -0.001326 0.000506 0.008860 -0.005468 -0.008240 -0.007486 0.005104 -0.004304 0.015580 ... 0.432841 -1.572423 -0.050012 -3.641469 -2.554754 -0.282550 -1.758610 -1.931872 1.902900 -0.087452
2 0.002818 -0.005823 0.001213 0.002864 -0.001885 -0.007623 0.001609 0.006572 -0.000774 0.018426 ... -1.541419 -1.448775 0.905741 -0.183866 1.045458 0.302338 -1.017309 -0.331573 1.140685 -0.280409

3 rows × 458 columns

T_SNE

After running through several interactions, and after doing some research I have decided to change the measuring metric on T_SNE from euclidean to cosine distance. According to some research, changing the distance measurement to cosine empirically performs better on a sparse matrices. Additionally, the plot does seems to be slightly better. (the plot will print outside of the notebook)

In [31]:
### t_sne
TSNE()
X_embedding = TSNE(metric='cosine',n_components=3, perplexity=1000, n_iter = 500, learning_rate=500).fit_transform(df_token,)

### 3d Graph Prints outside notebook ###
get_ipython().run_line_magic('matplotlib', 'inline')
df_sne_3d=pd.DataFrame(X_embedding, columns=['tsne_1',"tsne_2","tsne_3"])
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
color=[]
for a in df_des['User_cat']:
    if a=='Great':
        color.append('green')
    elif a=='Good':
        color.append('yellow')
    elif a=='Poor':
        color.append('red')
    else:
        color.append('NaN')
ax.scatter3D(xs=df_sne_3d.tsne_1, ys=df_sne_3d.tsne_2, zs=df_sne_3d.tsne_3,c=color, marker='o')
Out[31]:
<mpl_toolkits.mplot3d.art3d.Path3DCollection at 0x21507d88908>
In [15]:
### appending t-sne to master
master=pd.concat([master, df_sne_3d],axis=1)
print("Shape of the reduced dimension df with tsne, sparse and pca", master.shape)
display(master.head(3))

### Cleaning up my memory
del(df_sne_3d, cuts, color,ax,fig,standard,outliers,outliers_master,Envelope_prediction,X_embedding,a)
Shape of the reduced dimension df with tsne, sparse and pca (991, 464)
Sparse_1 Sparse_2 Sparse_3 Sparse_4 Sparse_5 Sparse_6 Sparse_7 Sparse_8 Sparse_9 Sparse_10 ... PC_435 PC_436 PC_437 PC_438 tsne_1 tsne_2 tsne_3 tsne_1 tsne_2 tsne_3
0 0.003111 -0.007090 -0.001605 0.004124 -0.003503 0.001851 -0.003369 0.001227 -0.000279 0.031796 ... 0.007989 -0.026510 -0.548945 -0.393779 -54.483097 -7.862460 -46.616386 79.922234 75.020866 -11.045379
1 0.003174 -0.001326 0.000506 0.008860 -0.005468 -0.008240 -0.007486 0.005104 -0.004304 0.015580 ... -1.758610 -1.931872 1.902900 -0.087452 40.230015 29.511326 42.572079 -71.080284 -10.586238 -64.524429
2 0.002818 -0.005823 0.001213 0.002864 -0.001885 -0.007623 0.001609 0.006572 -0.000774 0.018426 ... -1.017309 -0.331573 1.140685 -0.280409 13.820995 -53.411522 -69.516098 1.341141 0.462629 34.949226

3 rows × 464 columns

The following graph shows the total count of all target values. I made this graph to see the balance of the classes. In all, I would say that the class balance is pretty good, and therefor I will not pursue any type of re-sampling techniques.

In [458]:
### seeing how the classes balance out

get_ipython().run_line_magic('matplotlib', 'inline')
sns.countplot(df_des['User_cat'])
Out[458]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a8f66f6788>

Section Three: Training the models

Do to the constraint of the assignment I will be limited in my use in using gridsearchcv(), and this is because I can't fit my y_testing data on individual models when I run grid search. In order to alleviate this issue I will use Paramgrid(), which is a function that returns the grid of gridsearchcv(). This param grid will be used to fit indivdual models throughout the analysis. The fitting process will be broken down into four parts. Part one, will be the dictionary layout for paramgrid. Part two will be the functions that I plan to use for the modeling fitting. Part three will be fitting all possible models using four types of features which are all possible features, TSNE features, Sparse PCA features and PCA features. The Last Part will fit models all possible models based on feature importance.

Part One: Param Dictionaries

In [20]:
#### dictionaries for paramgrid ###

knn_params={
    'p':[2],
    'n_neighbors':[5,10,20,50],
    'n_jobs':[-1]
}

random_params={
    'n_estimators':[100,500],
    'min_samples_split':[5],
    'min_samples_leaf':[5],
    'n_jobs':[-1]
}
gradient_params={
    'learning_rate':[.1,.01,.001],
    'n_estimators':[100,500],
    'subsample':[.6,.8,1],
    'min_samples_split':[5],
    'min_samples_leaf':[5],
    'random_state':[4],
}
xg_params={
    'learning_rate':[.1,.01,.001],
    'n_estimators':[100,500],
    'subsample':[.6,.8,1],
    'max_depth':[5,7,9],
    'n_jobs':[-1]
}

master_params=[knn_params, random_params, gradient_params, xg_params]

Part Two: Function for the models

The following function takes in a model and a parameter grid and fits all possible models on a training set and then predicts the testing set

In [470]:
from sklearn.model_selection import ParameterGrid
from sklearn.model_selection import fit_grid_point

def get_model(model,x_train, y_train, x_test,y_test, param, scorring1, scorring2):
    """
    This function fits and returns a serries for a dataframe
    What it returns: all models given the paramter grid fit to the test data.
    """
    ### creates the paramter grid
    param_grid=ParameterGrid(param)
    ### puts the param grid into a dataframe
    df_param_grid=pd.DataFrame(param_grid)
    master_score1=[]
    master_score2=[]
    ### assigning clasifier name to the data frame
    df_param_grid['Clasifier']=model.__name__
    
    for params in param_grid:
        clf=model()
        ## setting the individual parameter grid to the function
        clf.set_params(**params)
        ### fitting the model
        clf.fit(x_train.values,y_train.astype(str).values)
        pred=clf.predict(x_test.values)
        ### calculating the score of y_test vs y_pred
        score=scorring1(y_test.astype(str).values, pred)
        print(score)
        master_score1.append(score)
        score=scorring2(y_test.astype(str).values, pred, average='macro')
        print(score)
        master_score2.append(score)
    ### assining the scores to the param grid df   
    df_param_grid[scorring1.__name__]=master_score1
    df_param_grid[scorring2.__name__]=master_score2

    return df_param_grid
In [28]:
x=ParameterGrid(xg_params)
for i in x:
    x=XGBClassifier(**i)
    x.fit(y_train)
    x.
Out[28]:
{'subsample': 1,
 'n_jobs': -1,
 'n_estimators': 100,
 'max_depth': 5,
 'learning_rate': 0.1}

Part Three: Master Model Run

The following for loop fits all possible models (see master model list) across all possible parameters (see parameter dictionary), and then fits all models across the different dimension reduction techniques.

In [ ]:
master_params=[knn_params, random_params, gradient_params, xg_params]
master_models=[KNeighborsClassifier, RandomForestClassifier, GradientBoostingClassifier, XGBClassifier]
target=df_des['User_cat'].astype(str)

### column list for fitting th different models
master_cols=master.columns
sparse_cols=[col for col in master.columns if 'Sparse' in col]
PC_cols=[col for col in master.columns if 'PC' in col]
TSNE=[col for col in master.columns if 'tsne' in col]
col_list=[master_cols, sparse_cols, PC_cols, sparse_cols]
col_names=['all', 'sparse', 'PCA', 'TSNE']


master_df=pd.DataFrame()

for i ,col in enumerate(col_list):
    ## splits train and Test 
    x_train, x_test, y_train, y_test= train_test_split(master[col], target,test_size=.2) 
    
    ### fits model on dimensions col
    for j ,model in enumerate(master_models):
        df=get_model(model, x_train, y_train,x_test,y_test, master_params[j], accuracy_score, f1_score)
        df['features']=col_names[i]
        master_df=pd.concat([master_df, df], axis=0)

master_df.to_csv("master_quiz_3.csv")
In [18]:
### Observing The output ###
print("Shape of Paramgrid: ", master_df.shape)
display(master_df.head(3))
Shape of Paramgrid:  (312, 15)
Unnamed: 0 Clasifier accuracy_score f1_score features learning_rate max_depth min_samples_leaf min_samples_split n_estimators n_jobs n_neighbors p random_state subsample
0 0 KNeighborsClassifier 0.336683 0.313983 all NaN NaN NaN NaN NaN -1.0 5.0 2.0 NaN NaN
1 1 KNeighborsClassifier 0.346734 0.339408 all NaN NaN NaN NaN NaN -1.0 10.0 2.0 NaN NaN
2 2 KNeighborsClassifier 0.366834 0.352205 all NaN NaN NaN NaN NaN -1.0 20.0 2.0 NaN NaN

Part Four: Models from Feature Importance

The following are models based on feature importance. The features importance was determined through XGBoost's feature importance moduel. The model fit from XGBoost feature importance was determined using the best fit model from the previous model fitting. The features for this model run was determined using three criteria: information gain, weight, and total information gain. These features will be broken down into four categories and these categoreis will be run through all possible models.

In [547]:
### looking through the results i beleive a combination of features may produce the best results, for this excercise i will
### pick what seem to be the best features. 

## step one using the best model thuse far (using f1-score) and extracting the best features
x_train, x_test, y_train, y_test= train_test_split(master, target,test_size=.2) 

### fitting the top model from previous model fittings
clf=XGBClassifier(learning_rate=.1, max_depth=7, n_estimators=100, n_jobs=-1, subsample=1.)
clf.fit(x_train,y_train)

### plotting best features
### To get gain and weight chnage importance_type='total_gain'
plot_importance(clf,max_num_features=20, importance_type='total_gain')
Out[547]:
<matplotlib.axes._subplots.AxesSubplot at 0x1a8ff343048>
In [ ]:
#### fitting model after slecting features based on gain, weight and total gain

total_gain=['PC_1','PC_12','PC_3','PC_10','PC_220', 'PC_16','PC_45','PC_54','PC_57', 'tsne_3']
weight=['tsne_1','tsne_2','tsne_3','Pc_12', 'PC_3', 'PC_16', 'Sparse_1', 'Sparse_11','PC_9', 'PC_4']
gain=['PC_1','PC_57','PC_177','PC_408', 'PC_10', 'PC_328','PC_126','PC_195','PC_278','PC_307']
all_select=set(total_gain+weight+gain)

col=[total_gain, weight, gain, all_select]
col_names=['Total_Gain', 'Weight', 'Gain', 'All_select']

select_master=pd.DataFrame()

for i ,col in enumerate(col_list):
    ## splits train and Test 
    x_train, x_test, y_train, y_test= train_test_split(master[col], target,test_size=.2) 
    
    ### fits model on dimensions col
    for j ,model in enumerate(master_models):
        df=get_model(model, x_train, y_train,x_test,y_test, master_params[j], accuracy_score, f1_score)
        df['features']=col_names[i]
        select_master=pd.concat([select_master, df], axis=0)
In [554]:
select_master.sort_values('accuracy_score')
Out[554]:
Clasifier accuracy_score f1_score features learning_rate max_depth min_samples_leaf min_samples_split n_estimators n_jobs n_neighbors p random_state subsample
3 KNeighborsClassifier 0.261307 0.206945 Gain NaN NaN NaN NaN NaN -1.0 50.0 2.0 NaN NaN
2 KNeighborsClassifier 0.336683 0.305608 Total_Gain NaN NaN NaN NaN NaN -1.0 20.0 2.0 NaN NaN
3 KNeighborsClassifier 0.336683 0.289420 Total_Gain NaN NaN NaN NaN NaN -1.0 50.0 2.0 NaN NaN
1 KNeighborsClassifier 0.341709 0.333114 Total_Gain NaN NaN NaN NaN NaN -1.0 10.0 2.0 NaN NaN
2 KNeighborsClassifier 0.341709 0.330366 Gain NaN NaN NaN NaN NaN -1.0 20.0 2.0 NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
49 XGBClassifier 0.582915 0.545212 Gain 0.001 9.0 NaN NaN 100.0 -1.0 NaN NaN NaN 0.8
6 XGBClassifier 0.582915 0.535810 Gain 0.100 7.0 NaN NaN 100.0 -1.0 NaN NaN NaN 0.6
51 XGBClassifier 0.582915 0.528345 Gain 0.001 9.0 NaN NaN 500.0 -1.0 NaN NaN NaN 0.6
10 GradientBoostingClassifier 0.582915 0.524818 Gain 0.010 NaN 5.0 5.0 500.0 NaN NaN NaN 4.0 0.8
7 GradientBoostingClassifier 0.592965 0.507791 Gain 0.010 NaN 5.0 5.0 100.0 NaN NaN NaN 4.0 0.8

312 rows × 14 columns

In [566]:
master_model=pd.concat([select_master,master_df])


master_model['Your Name']='Joshua Roberge'
master_model['Random State']=4
master_model_1=master_model.drop(['p','random_state', 'min_samples_split', 'n_jobs','min_samples_leaf'], axis=1)
master_model_1=master_model_1.rename(columns={'Clasifier': 'Algorithm'})
master_model_1.to_csv('Master_model_1.csv')

Conclusion

At This point in time I can not conclusively say that a video game's performance effects its outcome, nor can I say that a video games description has any sort of predictive capabilities for predicting a video games performance.

Going Forward:

Not all is lost! Perhaps a deeper dive into NLP could solve the issue. Going forward the use of a lexicon could prove beneficial and produce better predictive features. Additionally, I believe playing around with transformations on the matrix could improve the model's overall performance.