Express servers, which contain eight quad-core processors. Each server therefore contains thirty-two processor cores, and ninety servers combine to make the two thousand eight hundred and eighty processor cores that make up Watson's brain. So what would you define Watson to be? A midrange computer? A mainframe computer? Or a supercomputer?
I would define Watson to be a supercomputer, since it was designed for one purpose yet can do more than just rapid mathematical computations. Watson has thousands of central processing units. It can run computation-intensive applications and has a massive amount of online and offline storage. Coordinating all of these processors into one functioning logical unit required a group of engineers from IBM to develop a specialized kernel-based virtual machine implementation able to process eighty teraflops.
The software that allowed all of this to occur is called Apache Hadoop. Hadoop is an open source software framework that is used to organize and manage grid computing environments. Because current technology is set at a central processing unit (CPU) clock speed of three gigahertz, a software model to enhance parallel processing for supercomputers had to be developed. With Hadoop, the programmers at IBM could more easily write applications for Watson that took advantage of parallel processing to increase the speed at which problems could be solved and questions could be answered.
The main reason this makes things faster is that one question can be researched along multiple paths at one time using parallel processing. This large number of processors gives Watson the ability to answer, in three seconds, a question of the kind its programming was designed for. If you doubled the number of processors you could cut the time to about a second and a half, and tripling the number of processors could make it possible to answer a question in about a second.
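The scaling claim above can be checked with a short calculation. This is a minimal sketch that assumes perfectly linear scaling, which is the assumption the text itself makes; the function name `ideal_answer_time` is made up for illustration and is not part of any Watson software.

```python
def ideal_answer_time(base_seconds: float, scale: float) -> float:
    """Answer time if the work divides perfectly across `scale` times as many processors.

    Assumption: perfectly linear scaling with no coordination overhead.
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return base_seconds / scale


BASE_SECONDS = 3.0  # answer time quoted in the text for the full system

for scale in (1, 2, 3):
    seconds = ideal_answer_time(BASE_SECONDS, scale)
    print(f"{scale}x processors -> {seconds:.2f} s")
```

Real workloads rarely scale this cleanly, because coordination overhead tends to grow with the number of processors, so these figures are best read as an upper bound on the speedup.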
Would this be theoretically necessary? Most likely not, but it is a demonstration of what the IBM Watson supercomputer is capable of. The Watson supercomputer used ninety IBM Power 750 Express servers that cost around $35,000 each. This meant that Watson cost over $3.15 million before the data storage, I/O, networking and installation were factored in. Is this within reach of many large corporations, hospitals and research institutions? Most likely yes, but what if smaller companies or small clinics and hospitals wanted to take advantage of this technology?
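The server cost quoted above follows from a single multiplication. The sketch below uses only the two figures given in the text (ninety servers at roughly $35,000 each); the function name `hardware_cost` is illustrative.

```python
SERVER_COUNT = 90              # servers in Watson, per the text
PRICE_PER_SERVER_USD = 35_000  # approximate price per server, per the text


def hardware_cost(servers: int, price_usd: int) -> int:
    """Total server cost, excluding storage, I/O, networking and installation."""
    return servers * price_usd


total = hardware_cost(SERVER_COUNT, PRICE_PER_SERVER_USD)
print(f"Server hardware: ${total:,}")
```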
Studies showed that the programming that runs the Watson supercomputer could be used on as few as one IBM Power 750 Express server, at the price of waiting longer for the answer. With one server it could take up to 2 hours, but with nine servers you could have an answer in 30 seconds. This scalability makes the Watson supercomputer concept a viable alternative for many companies' problem-solving needs. Watson was built with sixteen terabytes of random access memory (RAM).
This RAM is spread throughout the ninety servers that compose Watson, which means that each server was equipped with around 180 gigabytes of RAM. But only specialized software could make it possible for all of this RAM, spread across ninety different servers, to work together. Again, the software that allowed all of this to occur is Apache Hadoop, the open source software framework used to organize and manage grid computing environments. Hadoop was designed with an integrated distributed file system that allowed all operating data to be loaded into memory.
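Dividing sixteen terabytes evenly across ninety servers can be checked with a quick calculation. This sketch assumes an even split and decimal units (1 TB = 1,000 GB); the function name `ram_per_server_gb` is illustrative only.

```python
TOTAL_RAM_TB = 16   # total RAM quoted in the text
SERVER_COUNT = 90   # servers in Watson, per the text
GB_PER_TB = 1_000   # assumption: decimal units


def ram_per_server_gb(total_tb: float, servers: int) -> float:
    """RAM per server in gigabytes, assuming an even split across servers."""
    return total_tb * GB_PER_TB / servers


per_server = ram_per_server_gb(TOTAL_RAM_TB, SERVER_COUNT)
print(f"About {per_server:.0f} GB of RAM per server")
```

With binary units (1 TB = 1,024 GB) the result is slightly higher, but it stays in the same range of roughly 180 GB per server.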
This massive amount of RAM gives Watson the ability to have the data required to answer questions at its fingertips. If the computer is pulling the data from memory, or RAM, the access time to find and transfer the data is measured in microseconds, or millionths of a second. If the computer has to pull data from a hard drive, the access time is measured in milliseconds, or thousandths of a second. The access time for a network drive could be measured in seconds, depending on the amount of data being requested.
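To make the difference between these storage tiers concrete, the sketch below counts how many sequential lookups would fit into a three-second answer window. The access times are order-of-magnitude assumptions taken from the units named in the text (one microsecond, one millisecond, one second), not measurements, and the three-second window is the answer time quoted earlier.

```python
# Assumed order-of-magnitude access times, in microseconds (not measured values).
ACCESS_TIME_US = {
    "RAM": 1,                    # about one microsecond
    "hard drive": 1_000,         # about one millisecond
    "network drive": 1_000_000,  # about one second
}

ANSWER_WINDOW_US = 3_000_000  # three seconds, expressed in microseconds


def lookups_within(window_us: int, access_time_us: int) -> int:
    """Number of sequential lookups that fit inside the time window."""
    return window_us // access_time_us


for tier, access_time in ACCESS_TIME_US.items():
    count = lookups_within(ANSWER_WINDOW_US, access_time)
    print(f"{tier}: {count:,} lookups in 3 s")
```

Under these assumptions each step down the hierarchy costs a factor of about a thousand or more in the number of lookups available per answer, which is why keeping the working data in RAM matters so much.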
If you look at the data presented, having access to data in microseconds can allow a computational process to be completed in a fraction of the time otherwise required. This is what allowed Watson to demolish the two greatest Jeopardy champions ever. Where does Watson keep its data? It uses a storage solution that is a modified IBM SONAS cluster with a total of more than 21 TB of raw capacity. Upon boot-up, Watson loads all of the stored data into the 16 terabytes of RAM that it has. For Jeopardy, this data consisted of the Wikipedia database (it was a valid source for IBM), millions of books, song lyrics, and other writings.
Reports are that all of this data totaled only one terabyte, which IBM claims you could fit on a universal serial bus (USB) drive that you could buy at your local electronics store. This data is what Watson was programmed to load for this particular application; the commercial uses being pushed for Watson are in the financial industry and hospitals. So would doubling the amount of RAM that Watson has allow it to operate even more efficiently? That depends on the application the system is being used for.
Whether the total amount of required data can all be loaded into memory is the only question that you need to be able to answer yes to each and every time. This is because any outside request for data slows Watson down and makes it less efficient. IBM Watson used Juniper switches running at 10-gigabit-per-second Ethernet (10GbE) speeds. During the Jeopardy experiment Watson was not connected to the internet. Instead, Watson used the Ethernet links for the IBM POWER7 servers to talk to each other and to access files over the Network File System (NFS) protocol on the internal customized SONAS storage I/O nodes.
What is a Juniper switch? Watson's network consists of one IBM J16E (EX8216) switch populated with fifteen ten-gigabit-per-second line cards and one one-gigabit-per-second line card, as well as three IBM J48E (EX4200) switches installed in a virtual chassis configuration. These switches all run Juniper's Junos network operating system. This operating system enables up to ten IBM J48E switches to be configured in a single virtual chassis. A virtual chassis is a flexible, highly scalable switch solution that allows several switches to form one unit, as if they were within a single chassis.
The switches operate together over a 128-gigabit-per-second backplane with scalability of up to 480 access ports and a 2.4-terabyte-per-second fabric, transferring over 2 billion packets per second (pps) to as many as 6,000 servers in a single domain. If the CPU cores of Watson are the calculating part of the brain, with the RAM being its memory, then the Juniper switch would be its nervous system, getting the data from one place to another in times that can be measured in nanoseconds, or billionths of a second. What is SONAS? Scale-out Network-Attached Storage (SONAS) is a clustered NAS system that has separate interface and storage nodes.
SONAS uses an IP network with the NFS, Common Internet File System (CIFS) and Secure Copy Protocol (SCP) protocols. SONAS clusters are designed to be built and marketed in configurations that range from twenty-seven to four hundred and eighty terabytes, using either Serial Advanced Technology Attachment (SATA) or Serial Attached Small Computer System Interface (SAS) drives. The main application that takes a human question, evaluates its meaning and produces an answer for Watson is DeepQA. This is a large-scale problem solver that uses parallel processing to look for answers along multiple paths.
To produce a solution, DeepQA uses an analysis, hypothesis, filtering, and scoring technique to eliminate false data and produce the most relevant answer to a question. The things that DeepQA cannot handle are listed below:
1. Questions that contain a video or audio clue needed to solve them.
2. Questions that contain multiple clues that would have to be solved to answer the initial question.
3. Questions where the answer is a combined interpretation of two separate clues, or where the answers rhyme and this analysis would be required to solve the question.
4. Questions that have multiple answers where one answer is more correct than the others.
The inability to perform these functions shows that the artificial intelligence IBM developed for the Watson supercomputer cannot function the same way as a human being. Each function must be programmed for the specific environment that Watson is going to operate in, and Watson cannot interpret questions that are not forthright, that contain audio or video, or that require interpretation or reading between the lines, per se.
The things that Watson can do are built on the foundations of content acquisition, question analysis, hypothesis generation, hypothesis evidence and scoring, and ranking and confidence estimation. Without all of these components, Watson could not do what it does. Content acquisition allows Watson to identify sources and scope nuggets in order to identify more information. Question analysis attempts to understand what is being asked and directs the other components to perform an initial analysis to help determine how to process the question.
Hypothesis generation produces candidate answers by searching the results of the question analysis and chopping out nuggets of data. These nuggets are then reanalyzed to validate each one as an answer and to produce a score that is used in the evidence and ranking process. The final ranking and confidence estimation is performed by an algorithmic formula that bases the score on the previously learned probability of a candidate answer being correct or incorrect.
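The stages described above can be illustrated with a toy pipeline. This is a minimal sketch and not IBM's DeepQA code: the three-entry corpus, the stopword list, the keyword-overlap scoring rule and every function name are assumptions made up for illustration. It only shows how question analysis, hypothesis generation, evidence scoring and ranking can feed one another.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical toy corpus standing in for the content acquisition step.
CORPUS = {
    "Paris": "Paris is the capital city of France on the river Seine",
    "Lyon": "Lyon is a large city in France known for its cuisine",
    "Berlin": "Berlin is the capital city of Germany",
}

# Hypothetical stopword list; a real system would use far richer analysis.
STOPWORDS = {"the", "is", "of", "a", "what", "which", "in", "on", "for", "its"}


@dataclass
class Candidate:
    answer: str
    confidence: float


def analyze_question(question: str) -> set[str]:
    """Question analysis: reduce the question to lowercase keywords."""
    words = (w.strip("?.,!").lower() for w in question.split())
    return {w for w in words if w and w not in STOPWORDS}


def generate_hypotheses(keywords: set[str]) -> list[str]:
    """Hypothesis generation: any entry sharing a keyword becomes a candidate."""
    return [
        name
        for name, text in CORPUS.items()
        if keywords & set(text.lower().split())
    ]


def score(keywords: set[str], answer: str) -> float:
    """Evidence scoring: fraction of question keywords found in the passage."""
    passage = set(CORPUS[answer].lower().split())
    return len(keywords & passage) / len(keywords)


def answer_question(question: str) -> list[Candidate]:
    """Run all stages and return candidates ranked by confidence, best first."""
    keywords = analyze_question(question)
    if not keywords:
        return []
    candidates = [
        Candidate(name, score(keywords, name))
        for name in generate_hypotheses(keywords)
    ]
    return sorted(candidates, key=lambda c: c.confidence, reverse=True)


ranked = answer_question("What is the capital city of France?")
for candidate in ranked:
    print(f"{candidate.answer}: {candidate.confidence:.2f}")
```

In this sketch the confidence is a plain keyword-overlap ratio. The text describes the real system as basing its score on previously learned outcomes, which this example does not attempt to reproduce.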
The Watson supercomputer runs on the SUSE Linux Enterprise Server 11 OS, which provides advanced memory management, support for multiple processor types, strong performance on systems with multicore processors, the Native Portable Operating System Interface (POSIX) Thread Library (NPTL), and advanced multipathing and I/O capabilities. Benchmarks and testing have shown that SUSE Linux outperforms any other operating system on the POWER7 servers built by IBM.
Another factor that makes SUSE such a valuable choice for Watson is that this is a commercial venture that needs to interoperate with the other operating systems often used in today's data centers. SUSE is able to operate seamlessly with both Windows and UNIX in a mixed infrastructure. The ability to run many different applications would be critical to making Watson a commercial success. With SUSE Linux Enterprise Server installed, you have the ability to run mission-critical databases, e-commerce applications and ERP systems, as well as e-mail, file, print and web servers.
This would make Watson more than an expensive question-and-answer machine. What are Unstructured Information Management (UIM) applications? These are software systems that look at large volumes of unstructured information to interpret the data that is relevant to the end user. This is important in the Watson supercomputer because it breaks down the parts of the question that is submitted to the DeepQA processing application.