8 questions about big data

Q1. What are the most important characteristics of big data that separate it from ordinary data?

Q2. Can an employee information data set stored as a table with four attributes (employee_id, name, phone, address) and 10,000 records be considered big data? Explain your answer.

Q3. A flow of new information in the form of new articles, pictures, videos, tweets, etc. can be mined and stored in cloud infrastructure. Describe the steps needed to extract information from such unstructured, multi-source data. For example, if you want to know the number of news items in which Brad Pitt and Angelina Jolie appear together, what steps are needed to gather this information?

Q4. Why are JSON and XML considered to be semi-structured?

Q5. What is metadata? How can metadata help us in pattern recognition?

Q6. Statement: MapReduce can be used to extract patterns from unstructured data. Is the statement correct? Explain your answer.

Q7. Given the following content of a document:

I saw elba able was I is an example of palindrome. Palindrome is a sentence that reads the same backwards as forwards

Apply the three MapReduce steps to perform word counting on the content of the document. Note that the document has two sentences, not one!
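As a way to check a hand-worked answer, the three steps (map, shuffle, reduce) can be simulated in a few lines of Python. This is only a minimal single-machine sketch, not a real MapReduce job. The function names are hypothetical, and it assumes that each sentence is one input split, that words are lowercased (so "Palindrome" and "palindrome" count as the same word), and that punctuation is dropped.

```python
import re
from collections import defaultdict

DOCUMENT = ("I saw elba able was I is an example of palindrome. "
            "Palindrome is a sentence that reads the same backwards as forwards")

def map_phase(sentence):
    """Map: emit a (word, 1) pair for every word in one input split."""
    # Assumption: lowercase the text and keep only alphabetic runs.
    return [(word, 1) for word in re.findall(r"[a-z]+", sentence.lower())]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the list of ones collected for each word."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(document):
    # Assumption: one split per sentence, so the document yields two splits.
    splits = [s.strip() for s in document.split(".") if s.strip()]
    mapped = [pair for split in splits for pair in map_phase(split)]
    return reduce_phase(shuffle_phase(mapped))

counts = word_count(DOCUMENT)
for word in sorted(counts):
    print(word, counts[word])
```

If case were kept as-is instead, "Palindrome" and "palindrome" would be counted as two different keys, so the choice of normalization should be stated in the answer.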

Q8. Given a matrix:

And a vector

Perform matrix-vector multiplication using MapReduce. Show all three MapReduce steps.
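The general pattern for this kind of exercise can be sketched in Python as follows. The matrix and vector values used here are hypothetical placeholders, not the ones from the question, and the function names are also assumptions. The sketch further assumes the whole vector fits in memory on every mapper: the map step emits (row index, m_ij * v_j) for each matrix element, the shuffle step groups the products by row index, and the reduce step sums each group.

```python
from collections import defaultdict

# Hypothetical placeholder inputs; substitute the matrix and vector from the question.
MATRIX = [[1, 2],
          [3, 4]]
VECTOR = [5, 6]

def map_phase(matrix, vector):
    """Map: for each element m_ij, emit (i, m_ij * v_j)."""
    return [(i, m_ij * vector[j])
            for i, row in enumerate(matrix)
            for j, m_ij in enumerate(row)]

def shuffle_phase(pairs):
    """Shuffle: group the partial products by row index i."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the partial products of each row to get one output entry."""
    return [sum(groups[i]) for i in sorted(groups)]

result = reduce_phase(shuffle_phase(map_phase(MATRIX, VECTOR)))
print(result)  # [17, 39] for the placeholder inputs above
```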
