Commit 8ec03b8a authored by Morten Brun

improved clustering and added exercise

parent ccd2dd93
# Visual data analysis
# Visual machine learning
I will demonstrate some techniques for quick visual analysis of data.
This can be made into the first step into actual machine learning with
statistical validation.
## Exercise
* If you have not yet installed Python, follow the instructions below
  to do that.
* Start Jupyter and execute the notebook
  [morten/python/proteomics_IV_pca.ipynb](python/proteomics_IV_pca.ipynb). (You
  will need to use Jupyter to navigate to it. If you click the link,
  you will see the raw content of the file.)
* See if you understand what happened and why. Ask about the things
  that are unclear. Answer the following questions:
  * Where is the data file that is analyzed located? Open it with
    Excel or another suitable application.
  * What is the shape of the data in the object called `df`? Does it
    change during the execution?
  * How did we use information about the dosages to learn about the
    distribution of the samples treated with different dosages?
* Leave out one sample in the training stage and find a way to see
  if the code can be used to predict its dosage.
* Leave out one sample from each dosage group and do the same as
  above for all the samples that have been left out.
* Do the above multiple times and find the probability of the
  prediction being correct.
* Execute the remaining notebooks in the morten/python folder.
* Analyze a new data set. Are you able to learn the exposures of the samples?
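
As a starting point for the leave-out exercises, here is a minimal sketch of one possible approach. It is not the code from the notebook: it uses only NumPy, runs on synthetic data, and the names `pca_fit`, `predict_dosage` and `holdout_accuracy` are hypothetical helpers introduced for illustration. The assumed method is to fit PCA on the training samples only, project the held-out samples onto the same axes, and assign each one the dosage of the nearest group centroid.

```python
import numpy as np


def pca_fit(X, n_components=2):
    """Return the feature means and the top principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def predict_dosage(X_train, y_train, X_test, n_components=2):
    """Predict dosages by nearest group centroid in PCA space.

    PCA is fitted on the training samples only, so the held-out
    samples do not influence the projection.
    """
    mean, axes = pca_fit(X_train, n_components)
    Z_train = (X_train - mean) @ axes.T
    Z_test = (X_test - mean) @ axes.T
    labels = np.unique(y_train)
    centroids = np.array([Z_train[y_train == lab].mean(axis=0) for lab in labels])
    # Distance from every test sample to every centroid.
    dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)]


def holdout_accuracy(X, y, n_repeats=200, seed=0):
    """Repeatedly hold out one random sample per dosage group.

    Returns the fraction of held-out samples whose dosage was
    predicted correctly, an estimate of the probability of a
    correct prediction.
    """
    rng = np.random.default_rng(seed)
    correct = 0
    total = 0
    for _ in range(n_repeats):
        test_idx = np.array(
            [rng.choice(np.flatnonzero(y == lab)) for lab in np.unique(y)]
        )
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        pred = predict_dosage(X[train], y[train], X[test_idx])
        correct += int((pred == y[test_idx]).sum())
        total += len(test_idx)
    return correct / total


# Synthetic stand-in for the real data (an assumption, not the course data):
# 3 dosage groups, 6 samples each, 50 features. The first 10 features
# are shifted in proportion to the dosage.
rng = np.random.default_rng(42)
dosages = np.repeat([0, 1, 2], 6)
X = rng.normal(size=(18, 50))
X[:, :10] += 3.0 * dosages[:, None]

# Leave out a single sample (the first one) and predict its dosage.
single = predict_dosage(X[1:], dosages[1:], X[:1])
print("predicted:", single[0], "true:", dosages[0])

# Leave out one sample per group, many times, and estimate the accuracy.
acc = holdout_accuracy(X, dosages)
print("estimated probability of a correct prediction:", acc)
```

To try this on the real data, replace `X` with the numeric values of `df` (samples as rows) and `dosages` with the corresponding dosage labels. A nearest-centroid rule is only one choice; any classifier can be swapped in, as long as it is fitted without the held-out samples.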
### Recommendation: Install Python 3 with [Anaconda](https://www.anaconda.com/) for your OS.
......