What is Unstructured Data?
Unstructured data is unfiltered information with no fixed organizing principle, and it is often called raw data. Common examples are web logs, XML, JSON, text documents, images, video, and audio files. To be useful, unstructured data must be searched and parsed so that facts can be extracted from it. By some estimates, as much as 80% of enterprise data is unstructured, which makes it the most visible form of big data to many people. Because of its sheer volume, producing insights from unstructured data requires scalable analytics. Most, though not all, data lakes hold unstructured data, because data lakes offer lower-cost storage.
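As an illustration of what "searched and parsed to extract useful facts" can look like in practice, here is a minimal Python sketch that pulls structured fields out of raw web log text. The sample lines, the regular expression, and the names SAMPLE_LOG, LOG_PATTERN, and parse_log are assumptions made for this example; real logs vary in format and usually need a more forgiving parser.

```python
import re
from collections import Counter

# Hypothetical sample input: three lines in the Common Log Format plus one noise line.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
203.0.113.9 - - [10/Oct/2023:13:55:40 +0000] "GET /missing.html HTTP/1.1" 404 512
this line is noise and does not match
203.0.113.5 - - [10/Oct/2023:13:56:01 +0000] "POST /login HTTP/1.1" 200 128
"""

# Assumed pattern for Common Log Format lines; named groups become record fields.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+)'
)


def parse_log(text):
    """Return one dict per line that matches the pattern; skip everything else."""
    records = []
    for line in text.splitlines():
        match = LOG_PATTERN.match(line)
        if match:
            records.append(match.groupdict())
    return records


records = parse_log(SAMPLE_LOG)

# Once the text is parsed, ordinary aggregation becomes possible.
status_counts = Counter(record["status"] for record in records)
```

The noise line is silently dropped here. A production pipeline would more likely count or log the lines it cannot parse, since the share of unparseable input is itself useful information.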
Unstructured data typically contains more noise than value, and extracting the value hidden in such files requires strong skills and tools. It is a myth that relational databases cannot process unstructured data. Teradata's Unified Data Architecture embraces unstructured data in several ways, and Teradata Database and competing products can store and process XML, JSON, Avro, and other forms of unstructured data.
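The general idea that a relational engine can store and query such documents can be sketched with Python's built-in sqlite3 module. This is not Teradata's implementation or API. SQLite stands in purely for illustration, and the table name events, the helper json_field, and the sample payloads are all assumptions made for this example.

```python
import json
import sqlite3


def json_field(document, key):
    """Return one top-level field of a JSON object, or None if it cannot be read."""
    try:
        return json.loads(document).get(key)
    except (ValueError, AttributeError):
        # ValueError: the text is not valid JSON.
        # AttributeError: valid JSON, but not an object (for example, a list).
        return None


conn = sqlite3.connect(":memory:")

# Register the Python helper so it can be called from SQL (hypothetical name).
conn.create_function("json_field", 2, json_field)

# The raw documents are stored as plain text in an ordinary relational table.
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO events (payload) VALUES (?)",
    [
        ('{"user": "alice", "action": "login"}',),
        ('{"user": "bob", "action": "purchase", "amount": 42.5}',),
        ("not valid json",),
    ],
)

# Query inside the documents with SQL; unreadable rows are filtered out.
rows = conn.execute(
    "SELECT json_field(payload, 'user'), json_field(payload, 'action') "
    "FROM events "
    "WHERE json_field(payload, 'user') IS NOT NULL "
    "ORDER BY id"
).fetchall()
```

Commercial databases generally provide native document types and functions rather than a user-registered helper like this one. The sketch only shows that keeping raw documents in a table and filtering them with SQL are compatible.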