Wayne McAlpine Overview

Articles on Wayne McAlpine

Overview of Wayne McAlpine's High-Performance Computing (HPC) Research on Parallel Computation (CRPC)

Web spider BOT (Boris and Natalia)

The spider is operational (note: this is not a search engine)

Spiders the web:
Approx. 13,000,000 per 24 hr
Approx. 541,666 per 1 hr
Approx. 9,027 per 1 min
Approx. 150 per 1 sec

Indexes and writes data to disk:
Approx. 4,000,000 per 24 hr
Approx. 166,666 per 1 hr
Approx. 2,777 per 1 min
Approx. 46 per 1 sec
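The hourly, per-minute, and per-second figures are simple divisions of the quoted 24-hour totals; a quick sketch of the arithmetic (integer division, matching the rounded numbers above):

```python
# Derive approximate hourly, per-minute, and per-second rates
# from the 24-hour totals quoted in the overview.

def rates(per_day: int) -> tuple[int, int, int]:
    per_hour = per_day // 24
    per_min = per_hour // 60
    per_sec = per_min // 60
    return per_hour, per_min, per_sec

crawl_rates = rates(13_000_000)   # spidering: (541666, 9027, 150)
index_rates = rates(4_000_000)    # indexing:  (166666, 2777, 46)
```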

Re: Project Status
Date: November 18, 2002

It has been approximately ten days since the crawler system became operational, during which time I have added and changed some features in the system, namely search capabilities and caching support. The crawler network came to fruition almost exactly as planned, with some minor changes to the network configuration. For example, we mount all of the crawler root file systems over NFS to ease system administration and to provide a bit more redundancy. We have also installed dual 100 Mbit NICs in each crawler machine and segmented the network to increase throughput to the database server. Each crawler machine is also capable of an additional copper Gigabit connection as required.

All of the crawler machines are currently storing indexed site data on one uniprocessor database server attached to a redundant disk array. During the initial crawl we were utilizing 1-3 Mbit/s of downstream traffic full-time, and we experienced no latency or upstream bandwidth issues. I have also implemented dict caching on two of the crawler/search machines to measure the speed increase from cache support. Search speeds increased substantially, even when the cache root was mounted over NFS. The fastest caching we have tested to date is with the ndict tables stored directly on disk, and I am experimenting with a RAM-based file system, which could increase read speeds by several orders of magnitude again. I have also not yet implemented actual query caching, which should increase search performance dramatically as well.
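Query caching as described is not yet part of the system; a minimal sketch of the idea, assuming a hypothetical `backend` callable standing in for the real dictionary-table lookup, is an in-memory LRU map from normalized query string to result list:

```python
# Minimal query-cache sketch: results of previous searches are kept in
# memory so repeated queries skip the (slow) database lookup entirely.
# `backend` is a stand-in for the real dictionary-table search.

from collections import OrderedDict

class QueryCache:
    def __init__(self, backend, max_entries=10_000):
        self.backend = backend          # callable: query -> results
        self.max_entries = max_entries
        self._cache = OrderedDict()     # LRU order: oldest first

    def search(self, query):
        key = " ".join(query.lower().split())   # normalize case/whitespace
        if key in self._cache:
            self._cache.move_to_end(key)        # mark as recently used
            return self._cache[key]
        results = self.backend(key)
        self._cache[key] = results
        if len(self._cache) > self.max_entries:
            self._cache.popitem(last=False)     # evict least recently used
        return results
```

Normalizing the query before lookup means trivially different spellings of the same search ("Foo  Bar" vs. "foo bar") share one cache entry.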

The RDF parser, for example, has given us access to over 3.5 million cataloged and pre-indexed sites for our crawler seed, and it supports crawler seeding based on category profiling or simple word searches. As a result of these customizations, we can profile the initial seed of the crawler systems with the top keywords from Rawhide.
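A catalog dump like the one described can be mined for seed URLs with a simple scan; a rough sketch, where the `<ExternalPage about="...">` element name is an assumption modeled on Open Directory-style RDF rather than taken from the actual feed:

```python
# Sketch: pull candidate seed URLs out of an RDF catalog dump.
# The <ExternalPage about="..."> element is an assumed, ODP-style
# convention; adjust the pattern to the real schema as needed.

import re

PAGE_RE = re.compile(r'<ExternalPage\s+about="([^"]+)"')

def extract_seeds(rdf_text, keywords=None):
    """Return catalog URLs, optionally filtered by keyword substring."""
    urls = PAGE_RE.findall(rdf_text)
    if keywords:
        kws = [k.lower() for k in keywords]
        urls = [u for u in urls if any(k in u.lower() for k in kws)]
    return urls
```

The optional `keywords` filter corresponds to the category-profiling idea above: the initial seed list can be narrowed to URLs matching a chosen keyword profile.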

Since this system was designed as a 'crawler' system and not a search engine, the performance numbers we have been getting, while impressive, are not representative of the final deliverable search engine. The current system, for example, is limited partly by the fact that the database is under constant insert traffic from the crawlers (although with 5 indexer threads per crawler, we were still getting 1-2 second search times). I have also implemented a table optimization program that can further increase system performance by optimizing the dictionary tables in the database.

The system is also designed to be fairly fault-tolerant: the search machines execute parallel queries to the other search machines on the network, and when one fails, the worst that happens is that the database data from the failed machine is unavailable. Because of the parallel querying capabilities of this system, it is also important to enforce strict seeding of the crawler machines to prevent overlapping results from different machines; overlap can slow down performance considerably and produce queries with redundant results. As a performance safeguard, another program is being written to sweep the databases at scheduled intervals and remove redundant URLs.

1. Top search terms from SearchBoss. We require the terms that return actual results; the text files I have reviewed to date do not provide any meaningful information regarding successful searches occurring through the SearchBoss engine.

2. Actual traffic statistics from the SearchBoss service, in particular peak traffic times and the actual number of searches occurring on the service. If possible, I would like a traffic profile detailing traffic levels for each day of the week. It is important that these figures reflect searches, not hits or page views.

3. Required format for the search data for Google backfill. Since we have changed the scope of the project from crawling with bulk uploads to an actual search engine, I will need a specification from the Rawhide developers detailing how search results must be returned to the Google and SearchBoss systems. Important details such as connection methods, data formatting, search result limit protocols, and query URL formatting all need to be provided immediately.

The Galleria - Innovation Place
#319-15 Innovation Blvd.
Saskatoon, SK Canada
S7N 2X8

Overview of Wayne McAlpine OEM Antivirus Software

BitDefender OEM Antivirus Software

BitDefender Antivirus

Under the agreement, OneWorld Office will distribute the entire line of BitDefender data security products for home, SOHO and SMB users, including BitDefender Total Security 2009, which combines superior proactive protection from e-threats with instant-messaging encryption, a file vault for securely storing personal or sensitive files, and PC Tune-up for the ultimate in PC security.

Ginger Yerovsek, global sales director of BitDefender, said: "By partnering with OneWorld Office, BitDefender is able to further enhance our reach throughout North America. This agreement allows us to provide consumers and SMBs with the most advanced, yet simple-to-use antivirus software and data security solutions available. We are very pleased with this partnership and look forward to a longstanding relationship moving forward."

Robert Siemons, director of sales at OneWorld Office, said: "While the US economy is struggling and many companies are scarcely surviving, here at OneWorld Office, our strategic partnerships with well-managed, committed companies have allowed us to thrive even in uncertain markets. Our strategy has always been to associate ourselves with the best in any vertical we represent. A vertical that is indeed thriving at the moment is technical security and protection."


Kaspersky Lab OEM Antivirus Software

Kaspersky Lab Antivirus

G-Data OEM Antivirus Software

G-Data Antivirus

Panda OEM Antivirus Software

Panda Antivirus

ESET NOD32 OEM Antivirus Software

ESET NOD32 Antivirus

AVG OEM Antivirus Software

AVG Antivirus

Overview of Wayne McAlpine Data Recovery

Data recovery at Innovation Place

If you have ever dropped your laptop and then heard a horrible grinding sound when you turned it back on, you may know the agony of losing data. For many people, the effort to recover lost computer files begins and ends in frustration at the local computer repair shop where technicians use software-based recovery tools. But even for severely damaged hard drives there is another option that may be more affordable than you think.

Thanks to an ever-expanding partner network, that laptop, hard drive, or a whole host of removable media devices, could be packed up and shipped off to OneWorld Data Recovery's office and lab in the Concourse building at Innovation Place in Saskatoon.

" From my grandma needing help with her digital photos to IT professionals with a crashed server, everyone's data is valuable to them and we want to make it as easy as possible for them to get it back," says Wayne McAlpine, co-founder of OneWorld Data Recovery.

McAlpine and his partners have assembled a team of computer engineering professionals and given them the right tools, including a Class 100 clean room: a highly controlled environment with minimal airborne dust or contaminants. OneWorld Data Recovery boasts a very high success rate in recovering data from electronic devices suffering from physical damage or logical damage such as deletion or corruption. The process includes drive restoration, disk imaging and then data retrieval.

Of course physical damage, like dropping, is not the only way people lose data. Viruses and hackers can cause problems as well. So OneWorld Data Recovery has teamed up with four of the top ten antivirus developers in the world to market its newest product, known as Total Data Protection. McAlpine says the idea is to build customer loyalty by providing good old-fashioned customer service. "If it's complicated, nobody is going to use it. Our package includes an antivirus program, on- and off-site backup as well as data recovery software, so it really is protection made easy." By providing an easy-to-use data protection package, McAlpine expects to build loyalty with computer users and computer consultants who will turn to OneWorld Data Recovery for help when serious issues arise. His goal is to drive down the cost of data recovery so it is accessible to everyone.

" Data recovery is like a black art to a lot of people. They have no idea how it's done and often just give up without exploring all their options," says McAlpine, who is currently working with others in the industry to develop a certification program for data recovery technicians. He says industry standards will help to bring costs down and build people's confidence in the system.

The new Total Data Protection package is available at retailers across the country. For more information about data recovery go to www.oneworlddatarecovery.com.

Articles on OneWorld TDP

Data recovery: Panda Total Data Protection

OneWorld Data Recovery: whether the damage is physical or logical, such as deletion or corruption, we will salvage your important data.
OneWorld Total Data Protection is a proactive and effective solution for company or home use, dedicated to your peace of mind. It is Protection Made Easy.