In this article I show a systematic knowledge discovery method utilizing web search mechanisms. In the first section I outline the main difference between the power of knowledge and limitation of information. In the second section I suggest a systematic web search process. In the third section I try to outline when the proposed systematic search may be a wise choice.
Section 1: Why knowledge is better than information.
Through the ages we have managed to survive thanks to our knowledge. Knowledge is the skill that helps us predict future events, plan behavior and develop tools to support that behavior. Power is based on knowledge and makes use of knowledge; on the other hand, power reproduces knowledge by shaping it in accordance with its anonymous intentions. The ability of knowledge to shape the environment and provide a competitive edge is the main reason why people value it so highly. In this sense knowledge has to be useful and verified for its predictive power to work. Knowledge in itself is a powerful drug. A recent article in American Scientist offered an interesting theory of why some people are driven to seek knowledge: the kick of natural opioids in the brain.
Epistemology or theory of knowledge is the branch of philosophy concerned with the nature and scope (limitations) of knowledge. It addresses the questions:
- What is knowledge?
- How is knowledge acquired?
- What do people know?
- How do we know what we know?
- Why do we know what we know?
In this research, rather than answering these profound questions, we concentrate on the difference between information and knowledge with respect to the data that may be found on the internet.
How do we gain knowledge? Mainly through information, observations and experiences. Information is any type of pattern that influences the formation or transformation of other patterns. In this sense, there is no need for a conscious mind to perceive, much less appreciate, the pattern. Information does not have to be useful, true or meaningful. People pursue information when they lack knowledge. Stewart (2001) argues that the transformation of information into knowledge is a critical one, lying at the core of value creation and competitive advantage for the modern enterprise. Tools and processes are used to assist a knowledge worker in performing research and making decisions, including steps such as:
- reviewing information in order to effectively derive value and meaning
- referencing metadata if any is available
- establishing a relevant context, often selecting from many possible contexts
- deriving new knowledge from the information
- making decisions or recommendations from the resulting knowledge.
Why is information itself not sufficient? We are bombarded with information. Wikipedia’s notability guidelines state that a subject is notable if there are multiple reliable sources independent of the subject. One might naturally assume that an author who floods an article with sources is therefore safe. Bombardment is good when each source contributes a lot of information of its own. Since one of the purposes of references is to provide additional knowledge, providing more sources of information is a good thing. Bombardment is a nuisance when the sources are identical to one another or otherwise redundant. Unverified information has further, darker sides. Weakness, lies, and lack of success go very well together. They each lead to one another, and you generally want to avoid all three. Situated knowledge is knowledge specific to a particular situation. Some methods of generating knowledge, such as trial and error or learning from experience, tend to create highly situational knowledge. In other words, relying heavily on an unfiltered stream of information may produce highly specific, irrelevant or false knowledge.
What happens when we are exposed to information rather than knowledge? Information laziness or information fatigue results from continuous exposure to information. In other words, exposure to information that is not bound to the main body of personal knowledge causes information overload. Information overload is a term popularized by Alvin Toffler. It refers to an excess of information being provided, which makes processing and absorbing it very difficult for the individual, because sometimes we cannot see the validity behind the information. This hinders decision-making and judgment by causing stress and cognitive impediments such as confusion, uncertainty and distraction. It comes from all sources, including TV, newspapers and magazines, as well as wanted and unwanted e-mail and faxes. It has been exacerbated enormously by the formidable number of results obtained from web search engines.
Section 2: Systematic web search.
I think that the key to discovering knowledge rather than information is a method. Just as the scientific method helped bring order to the physical world around us, a proper knowledge discovery method may bring order to the body of information available on the web. The scientific method includes some of the following steps:
- Define the question
- Gather information and resources
- Form hypothesis
- Perform experiment and collect data
- Analyze data
- Interpret data and draw conclusions that serve as a starting point for new hypothesis
- Publish results and get feedback
- Retest collaboratively
- Feedback and updates
Let us try to define a search process around this method. The first step would be defining the topic. For this purpose we may first write down what we want to accomplish. Then we use directories, suggestion engines, and disambiguation and synonym tools to define a set of questions. In this research I formed the question “Why knowledge is better than information” before I started gathering resources.
The second step would involve gathering a large number of relevant paragraphs from various sources. The purpose of this step is to gather a wider field of keywords, concepts, sources and patterns to enable a deeper search. In this research I parsed through various wiki sources using the words “information” and “knowledge” until I got the phrase “knowledge is power” and the following graph:
The third step is crucial for the success of knowledge generation. In fact it is the stop point of the previous step of information gathering and the start point of systematic search. In this step the researcher writes down what sort of information he is looking for. In this research I formulated something like “If knowledge induces power due to prediction, information induces weakness due to overload. This difference is similar to the difference between scientific and situational knowledge”.
Sometimes it is hard to get from the hypothesis to systematic data collection. Crowdsourcing or another collaboration process may help by providing perspectives from other, potentially more experienced researchers. “This is what I want to find, where do I start” is a common question people ask when dealing with a complex search. Usually experienced researchers have very simple and good answers.
The fourth step involves systematic data collection. The research hypothesis is broken into several keyword combinations, which yield a large body of relevant paragraphs. Each paragraph is used to find related documents, keywords and paragraphs. Theoretically, this process may be infinite. In practice, the information starts to repeat itself very soon, which brings back the information bombardment problem. At some point it becomes very hard to make progress through the bombardment. At this point the analysis starts.
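The collection step and its stopping point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the pairing of keywords two at a time, and the 20% novelty threshold are all hypothetical choices of mine, and fetching paragraphs from a real search engine is left out entirely.

```python
from itertools import combinations


def keyword_queries(keywords, size=2):
    """Return every combination of `size` keywords as one query string."""
    return [" ".join(combo) for combo in combinations(keywords, size)]


def normalize(paragraph):
    """Lower-case and collapse whitespace so trivial variants compare equal."""
    return " ".join(paragraph.lower().split())


def collect(batches, threshold=0.2):
    """Accumulate paragraphs batch by batch.

    Stops after the first batch whose share of previously unseen
    paragraphs falls below `threshold` (an assumed cut-off), which is
    the point where the information has started to repeat itself.
    """
    seen = set()
    for batch in batches:
        current = {normalize(p) for p in batch}
        share_new = len(current - seen) / len(current) if current else 0.0
        seen |= current
        if share_new < threshold:
            break
    return seen
```

For example, `keyword_queries(["knowledge", "information", "power"])` produces the three pairwise queries, and `collect` would be fed one batch of paragraphs per query. The threshold is a judgment call; a stricter or looser value simply moves the point at which analysis begins.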
The fifth step involves a search for interesting patterns in the body of information found. These patterns may include phrases that repeat often, rare phrases that suddenly appear, the most cited documents, and so on. Once the patterns of interest have been found, they may describe very well the bulk of the knowledge found afterwards.
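One simple way to look for repeating and rare phrases is to count word n-grams across the collected paragraphs. The sketch below is an assumption about how this could be done, not the procedure used in this research; the function names, the choice of two-word phrases and the simple tokenizer are hypothetical.

```python
import re
from collections import Counter


def phrase_counts(paragraphs, n=2):
    """Count every run of `n` consecutive words across all paragraphs."""
    counts = Counter()
    for paragraph in paragraphs:
        # Crude tokenizer: lower-case words made of letters and apostrophes.
        words = re.findall(r"[a-z']+", paragraph.lower())
        counts.update(
            " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
        )
    return counts


def frequent_and_rare(counts, top=3):
    """Return the `top` most common phrases and all phrases seen only once."""
    frequent = [phrase for phrase, _ in counts.most_common(top)]
    rare = sorted(phrase for phrase, count in counts.items() if count == 1)
    return frequent, rare
```

Frequent phrases point at the themes that dominate the material, while phrases seen only once are candidates for the "rare phrases that suddenly appear". Both lists still need a human reader to decide which patterns are actually interesting.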
The sixth step involves writing down the analysis data as a research report. In this research we wrote down the text of the research report during the analysis step. The blanks in the research argumentations were filled through additional hypothesis and research, so that the argumentation patterns became well connected into one “picture”. In this research we used basic notions of epistemology to define the flow of the research argumentation.
The seventh step includes publishing the results. A detailed research report and collaboration over the research results help to refine knowledge and eliminate assumptions one person is oblivious to. In fact a detailed public report is an invitation to the eighth step.
The eighth step involves collaborative retesting of knowledge. In fact, knowledge is useful for predicting future results. It is hard for one person to test many possible methods, situations and limitations. For one thing, it is psychologically very hard to find problems with a theory we really like. My experience shows that only collaborative analysis based on a public research report may validate or invalidate knowledge.
The ninth step involves getting feedback and updates. It is reasonable that we will want to repeat the searches that interested us: there is always new data available, and there are reactions to our publication.
What are the main differentiators between the proposed systematic search and commonly used casual search?
- The systematic search includes several steps of refining key questions and keywords
- The systematic search includes a report of the search process, and can be repeated, updated and criticized accordingly.
- The systematic search includes collaborative elements, since it is hard for one person to take into account all the aspects of a problem.
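To make the second differentiator concrete, a search can be recorded as structured data that others can replay and criticize. The following is a minimal sketch; the `SearchStep` fields and the JSON layout are hypothetical and only illustrate the idea of a repeatable search report.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SearchStep:
    """One recorded step of a search (all field names are illustrative)."""
    step: str                                      # e.g. "gather", "hypothesis", "collect"
    query: str                                     # the query or question used
    sources: list = field(default_factory=list)    # where the results came from
    notes: str = ""                                # what was learned or decided


def to_report(question, steps):
    """Serialize the research question and its steps as a JSON report."""
    return json.dumps(
        {"question": question, "steps": [asdict(s) for s in steps]},
        indent=2,
    )
```

A collaborator who receives such a report can rerun each query, compare the sources and append further steps, which is what makes the search open to updates and criticism.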
Section 3: Beyond knowledge
In the previous section we outlined a labor-intensive process for generating what we believe to be true and verified knowledge over the web. Who needs this process? Clearly, if the consequences of a mistake are very small, a simple web search may be used. A simple search is also sufficient when a satisfying answer appears in a highly reliable source, such as Wikipedia. Here I will try to outline several cases when structured search is required:
- Scientific research. It goes without saying that any scientific research should be based on some sort of scientific method. The web is probably the most readily available information source for many philosophical and social studies. The potential of the wisdom of the crowd is yet to be exploited.
- Professional search. Searches for market information, business intelligence, patents, dedicated equipment and algorithmic solutions, searches performed by journalists, and some other professional searches are vital for the success of the researcher and his or her organization; therefore a method that limits mistakes is crucial.
- Search by information-challenged individuals. While the information is readily available, the know-how of the search process is not. People suffering from dyslexia or ADD, and people without sufficient computer skills or linguistic capabilities, may be unable to find the information using a regular search process. A structured search process and the wisdom of the crowds may reduce the number of errors and improve quality of life for such people.
- Search out of curiosity. People tend to become experts in various areas, due to curiosity or fixation. Fans want to know everything about celebrities, housewives need new recipes, sick people are looking for comfort and remedy. In this case we deal with repeating and deepening search and understanding, which may benefit from the systematic search method.
- Search as training. When educating people for research work, it is reasonable to conduct structured search which can be readily evaluated. Simple search enables information laziness in the education systems. Rather than systematically looking for the knowledge, students pull the first reasonable answer or alternatively state that such body of knowledge is unavailable. Detailed, structured and documented search may minimize information laziness, since search process itself becomes available for validation.
How should we use web-based knowledge? Wisdom is associated with the ability to apply knowledge. In this case I refer to the wisdom of the crowds. Opening systematic and “scientific” search to crowds will enable new methods of creating bodies of knowledge (akin to Wikipedia) and lead to new implementations of the knowledge, which I currently cannot even imagine.
- Knowledge is power since it enables us to predict the environment
- Information may be transformed into knowledge via careful research
- Information that is not transformed into knowledge causes fatigue and limits the decision-making abilities.
- Systematic search for knowledge on the web may be modeled on the scientific method
- The key elements of the systematic search are refinement of the questions, report and collaboration.
The systematic search may be useful for search professionals and amateurs, as long as the expected result of such search is knowledge rather than information.