Solve Data Sprint #19 Challenge | DPhi



Hi Team,
I am getting an "invalid output, expected 1 or 0" error while uploading my submission file. This is a multiclass classification problem with 4 output classes, so I want to know why uploading the CSV file produces this error.

@sagarnarula could you please submit now? We changed the evaluation metric, so it should work fine now.

cc: @anjum_r @kanishksh4rma @parth_nipun_dave @harshita13 @srinathkr

target = pd.DataFrame()
target['prediction'] = y_pred

target.to_csv('Submission.csv', index=False)
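Before uploading, it can help to read the file back and check its shape. The sketch below is only an illustration: the `y_pred` values are made up, the integer labels 0 to 3 are an assumption about how the 4 classes are encoded, and an in-memory buffer stands in for `Submission.csv`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical predictions for a 4-class problem (labels 0..3 are assumed).
y_pred = np.array([0, 3, 1, 2, 2, 0])

# Same layout as the snippet above: a single 'prediction' column, no index.
submission = pd.DataFrame({"prediction": y_pred})

# Write to an in-memory buffer instead of disk, then read it back.
buffer = io.StringIO()
submission.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# Quick checks: one column, one row per prediction, no missing values.
columns_ok = list(reloaded.columns) == ["prediction"]
rows_ok = len(reloaded) == len(y_pred)
no_missing = not reloaded["prediction"].isna().any()
print(columns_ok, rows_ok, no_missing)
```

If the row index were written as well (the default when `index=False` is left out), the file would contain an extra unnamed column, which a grader expecting a single column may reject.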

Yes, done! Thanks :slight_smile:

How can I improve accuracy? Do you guys change parameters and tune the models randomly, or is there a better approach? Can you help me out?

I would suggest a random search or grid search over the hyperparameters of the model :). If that doesn’t work, you could try another model or do some feature selection/engineering.
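A rough sketch of what a random search could look like with scikit-learn's `RandomizedSearchCV`. The data here is synthetic (generated with `make_classification`, 4 classes to mirror the thread), and the search space is purely illustrative, not a set of recommended values for this competition.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the competition data (assumption: 4 classes).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=8,
    n_classes=4, random_state=0,
)

# Hypothetical search space; the ranges are illustrative only.
param_distributions = {
    "n_estimators": [50, 100, 200],
    "max_depth": [5, 10, 20, None],
    "min_samples_split": [2, 5, 10],
}

# Sample 5 random combinations and score each with 3-fold cross-validation.
search = RandomizedSearchCV(
    ExtraTreesClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Swapping `RandomizedSearchCV` for `GridSearchCV` (with `param_grid` instead of `param_distributions` and no `n_iter`) tries every combination instead of a random sample, which is more thorough but slower.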

Hi All,
Thanks @dphi for giving a nice competition to work on at the beginning of the year.
This time I wanted to do something unique (get the best solution in minimal lines of code). So here is my 6-line code that got the second rank:

import pandas as pd; import numpy as np; from sklearn.ensemble import ExtraTreesClassifier
train_df = pd.read_csv('')
test_df  = pd.read_csv('')
train_y  = train_df['microorganism'].values
preds = ExtraTreesClassifier(n_estimators=200,random_state=2020,max_depth=21).fit(train_df.drop(['microorganism'],axis=1).values,train_y).predict(test_df.values)

I didn’t get this result on the first try, though. I started with LightGBM, XGBoost and CatBoost for my initial submissions. Then I used the GML library (developed by @muhammad4hmed and Naman) to get an idea of which algorithms perform well with 10 folds (link to the demo is here: ), and it showed Extra Trees performing better than the others. Once I had the best-performing algorithm, I tuned its parameters and got the results.
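For anyone who wants to try the same "compare first, tune later" idea without GML, here is a minimal sketch using plain scikit-learn and 10-fold cross-validation. This is not the GML API and not the exact procedure used above; the data is synthetic and the candidate list is an arbitrary assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the competition data (assumption: 4 classes).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=8,
    n_classes=4, random_state=0,
)

# Hypothetical candidates; any scikit-learn classifier could be added here.
candidates = {
    "extra_trees": ExtraTreesClassifier(n_estimators=100, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Mean accuracy over 10 stratified folds for each candidate.
mean_scores = {
    name: cross_val_score(model, X, y, cv=10, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Pick the best-scoring model as the one worth tuning further.
best_name = max(mean_scores, key=mean_scores.get)
for name, score in sorted(mean_scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
print("best:", best_name)
```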


Congratulations! BTW, the person in first place also used GML xD, so it was GML vs GML :joy: Impressive short solution!