You are given the attached CSV file for the purpose of quantifying the behavior of
Drosophila larvae using unsupervised machine learning. The data is taken from an
experiment in which multiple larvae are allowed to freely explore a space, guided by
an odor. Each row of the dataset captures behavioral parameters of one individual.
The column frame gives the real (physical) time of the experiment, and the column id
gives the id of the individual. These are followed by behavioral parameters for this
larva id and this physical time step. Example:
frame id parameterA
Notable parameters that we want to focus on are the spinepoint_x_n and spinepoint_y_n
columns, which hold (x,y) coordinates fitted to the spine of each larva. Plotting the
spine points creates the following picture:
Now carry out the following analysis of the dataset. Document everything in a Python
file or an IPython notebook that you can later share with us.
1. The spine point data contains some NaN (not a number) values in the spinepoint_x_n
and spinepoint_y_n columns, which are marked as nan in the CSV data. Carry
out linear interpolation for any NaN values, using the values before and after the
NaN rows. This can be done with the interp command from the numpy package.
2. Now, let us fit a polynomial function with a degree of 4 to the (x,y) spine point
data. This can be done with the polyfit function from numpy. Let us then
compute the residual error e from the fit x̃ to the data points x, using the
root-mean-square error (RMSE).
3. Now, compute a polynomial fit with the degree of 8. How does the RMSE change?
4. We now want to perform clustering of the absolute value of the polynomial
coefficients obtained from the first fit (4th degree). Let us for this purpose use the
KMeans class from the sklearn.cluster module of scikit-learn. Carry out the k-means
fits for numbers of clusters k = [2, 4, 6, ..., 30]. Using the inertia_ attribute of
each fitted clustering, create an elbow plot that suggests the optimal k for
clustering of the coefficients.
5. Now, repeat this using the 8th degree polynomial fit. What is the optimal k based
on the Elbow method for this feature set?
6. Pick one choice of k from the 4th degree polynomial fit, based on the Elbow
method. Create a visualization (histogram) showing how many total data rows
have been assigned to each cluster. Compare to the same k from the 8th degree
fit. Bonus: visualize the average spine configuration obtained for each cluster.
7. Now, compute the average spine_length variable for each cluster. What do
8. Divide the dataset into two labels according to the area variable (label A = area
below a value of 450, label B = area above a value of 450). Create a classifier using
your favorite method available in the scikit-learn package that classifies into labels
A and B. Create a random train and test data split with 10 and 90 percent of your
data, respectively. First, carry out the classification based only on your regression
coefficients (4th degree or 8th degree) and then based on all the data available in
the data frame. What 10-fold cross-validation accuracy do you get with your
method in each case?
9. Imagine you want to design an artificial neural network (ANN) that needs
to classify the spine point data into some training labels. You can take the training
labels to be the cluster assignments obtained from your k-means clustering. What
kind of network architecture would you suggest for this kind of classification task?
Sketch a network diagram that you could come up with to do this classification.
10. Create a model function that would initialize your designed ANN model in the
PyTorch or TensorFlow API. You can think of a class
NeuralNet(nn.Module) whose initialization creates the building blocks of the
ANN and assigns the connections between
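
As a rough illustration of tasks 1 to 3, the sketch below interpolates NaN values in a single column over the row index with numpy.interp, then fits polynomials of degree 4 and 8 with numpy.polyfit and compares the RMSE. The helper names interpolate_nans and fit_rmse, the synthetic arrays, and the choice to fit y as a function of x are all assumptions made for illustration; they are not taken from the dataset. The RMSE is computed in the usual way, as the square root of the mean squared difference between fitted and observed values.

```python
import numpy as np


def interpolate_nans(values):
    """Linearly interpolate NaN entries of a 1-D array over the row index.

    Leading or trailing NaNs are filled with the nearest valid value,
    because np.interp clamps outside the range of the known points.
    """
    values = np.asarray(values, dtype=float).copy()
    missing = np.isnan(values)
    if missing.all():
        return values  # no valid neighbours to interpolate from
    idx = np.arange(values.size)
    values[missing] = np.interp(idx[missing], idx[~missing], values[~missing])
    return values


def fit_rmse(x, y, degree):
    """Fit y ~ poly(x) of the given degree and return (coefficients, RMSE)."""
    coeffs = np.polyfit(x, y, degree)
    y_fit = np.polyval(coeffs, x)
    rmse = np.sqrt(np.mean((y_fit - y) ** 2))
    return coeffs, rmse


# Hypothetical column with gaps, standing in for one spinepoint_x_n column.
column = interpolate_nans([1.0, np.nan, 3.0, np.nan, np.nan, 6.0])

# Hypothetical spine of 12 points for one row (not real data).
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 12)
y = np.sin(3.0 * x) + 0.05 * rng.normal(size=x.size)

coeffs4, rmse4 = fit_rmse(x, y, 4)
coeffs8, rmse8 = fit_rmse(x, y, 8)
print("interpolated column:", column)
print("RMSE degree 4:", rmse4, "RMSE degree 8:", rmse8)
```

On the same data a least-squares fit of degree 8 cannot have a larger RMSE than one of degree 4, since the lower-degree polynomials are a special case of the higher-degree ones; whether the improvement reflects the spine shape or just noise is a separate question.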
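
For tasks 4 and 5, one possible way to collect the inertia values for an elbow plot is sketched below. The feature matrix here is random stand-in data with one row per fit and one column per coefficient; the function names elbow_inertias and plot_elbow are hypothetical, and the settings n_init=10 and random_state=0 are arbitrary choices, not requirements of the task.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(features, ks):
    """Fit KMeans once per k and return the inertia of each fit."""
    inertias = []
    for k in ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
        inertias.append(model.inertia_)
    return inertias


def plot_elbow(ks, inertias):
    """Draw inertia against k; the 'elbow' is where the curve flattens."""
    import matplotlib.pyplot as plt  # imported here so the sketch runs without a display

    plt.plot(list(ks), inertias, marker="o")
    plt.xlabel("number of clusters k")
    plt.ylabel("inertia")
    plt.title("Elbow plot")
    plt.show()


# Stand-in for the absolute values of the 4th degree coefficients (5 per row).
rng = np.random.default_rng(0)
features = np.abs(rng.normal(size=(200, 5)))

ks = range(2, 31, 2)  # k = 2, 4, 6, ..., 30
inertias = elbow_inertias(features, ks)
for k, inertia in zip(ks, inertias):
    print(k, round(inertia, 2))
```

Calling plot_elbow(ks, inertias) then draws the curve. Repeating the same steps with a nine-column matrix would cover the 8th degree coefficients.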
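
Task 8 could be approached along the following lines. This is only a sketch on synthetic data: the feature matrix, the simulated area values, and the choice of a random forest are assumptions, and any other scikit-learn classifier could be substituted. Rows with an area of exactly 450 are not covered by the task description; here they are placed in label B.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-ins: |coefficients| as features and an area that depends on them.
rng = np.random.default_rng(0)
n_rows = 500
features = np.abs(rng.normal(size=(n_rows, 5)))
area = (
    450.0
    + 100.0 * (features[:, 0] - features[:, 0].mean())
    + rng.normal(scale=10.0, size=n_rows)
)

# Label A: area below 450, label B: everything else.
labels = np.where(area < 450.0, "A", "B")

# Random split with 10 percent training data and 90 percent test data.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, train_size=0.1, random_state=0, stratify=labels
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

# 10-fold cross-validation accuracy on the full feature set.
cv_scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    features,
    labels,
    cv=10,
)
print("held-out accuracy:", test_accuracy)
print("10-fold CV accuracy:", cv_scores.mean())
```

Running the same code once with only the regression coefficients and once with all usable columns of the data frame gives the two accuracies the task asks to compare.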
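
For task 10, a minimal PyTorch sketch of such a class might look as follows. The architecture (a small fully connected network) is only one possible suggestion, and the sizes are hypothetical: 24 inputs assumes 12 spine points with an x and a y coordinate each, and 6 output classes assumes k = 6 clusters. Neither number is given in the task.

```python
import torch
from torch import nn


class NeuralNet(nn.Module):
    """Small fully connected classifier for flattened spine point coordinates."""

    def __init__(self, n_inputs=24, n_hidden=64, n_classes=6):
        super().__init__()
        # Building blocks: two hidden layers with ReLU, then a linear output layer.
        self.layers = nn.Sequential(
            nn.Linear(n_inputs, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, n_classes),
        )

    def forward(self, x):
        # Returns raw class scores (logits); nn.CrossEntropyLoss expects these.
        return self.layers(x)


model = NeuralNet()
batch = torch.randn(8, 24)  # hypothetical batch of 8 flattened spines
logits = model(batch)
print(logits.shape)
```

The predicted cluster for each row is then logits.argmax(dim=1), and training would pair the logits with nn.CrossEntropyLoss and the integer cluster labels.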