What is Machine Learning Data?

How do we split data in Machine Learning?

    • Training Data: The part of the data we use to train the model. This is the data the model actually sees (both inputs and outputs) and learns from.
    • Validation Data: The part of the data used to evaluate the model frequently while it is being fit on the training data, and to tune its hyperparameters (settings fixed before the model begins learning). This data plays its part while the model is training.
    • Testing Data: Once the model is completely trained, the testing data provides an unbiased evaluation. We feed in the inputs of the testing data and the model predicts values without seeing the actual outputs. We then compare those predictions with the actual outputs in the testing data. This shows how much the model has learned from the examples it was fed as training data.
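
The three-way split described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the 70/15/15 proportions, the helper names `split_dataset` and `mean_absolute_error`, the toy data, and the stand-in `model` function are all made up for the example and are not prescribed by anything above.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows and cut them into train, validation, and test parts.

    The fractions are illustrative defaults, not a recommendation.
    Whatever is left after the train and validation cuts becomes the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def mean_absolute_error(model, rows):
    """Average gap between the model's predictions and the actual outputs."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


# Toy data: inputs x with outputs y = 2x + 1 (made up for illustration).
rows = [(x, 2 * x + 1) for x in range(100)]
train, val, test = split_dataset(rows)


# Stand-in for a trained model. A real one would be fitted on `train`
# and tuned against `val` before this final check on `test`.
def model(x):
    return 2 * x + 1


# The model only receives the test inputs; the actual outputs are used
# afterwards to score its predictions.
test_error = mean_absolute_error(model, test)
print(len(train), len(val), len(test), test_error)
```

The point of the sketch is that the three parts never overlap, and the test rows are touched only once, after training and tuning are finished.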

Properties of Data –

    • Volume: The scale of data. With a growing world population and wider exposure to technology, huge amounts of data are generated every millisecond.
    • Variety: Different forms of data, such as healthcare records, images, videos, and audio clips.
    • Velocity: The rate at which data is streamed and generated.
    • Value: The meaningfulness of data, in terms of the information researchers can infer from it.
    • Veracity: The certainty and correctness of the data we are working with.


prwatech Silver Asked on January 28, 2019 in Education.