630 high scalability-2009-06-14-kngine 'Knowledge Engine' milestone 2

Meta information for this blog

Source: html

Introduction: Kngine is a Knowledge Web search engine designed to provide meaningful search results, such as semantic information about keywords and concepts, answers to the user’s questions, and the relations between keywords and concepts. It also links different kinds of data together, such as movies, subtitles, photos, prices at sale stores, user reviews, and influenced stories. Goals: Kngine's long-term goal is to make all of human beings' systematic knowledge and experience accessible to everyone. I aim to collect and organize all objective data, and make it possible and easy to access. Our goal is to build, on the advances of Web search engines, the semantic web, and data representation technologies, a new form of Web search engine that will unleash a revolution of new possibilities. Kngine tries to combine the power of Web search engines with the power of semantic search and data representation to provide meaningful search results that meet user needs. Status: Kngine started as a research project in O

Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 I aim to collect and organize all objective data, and make it possible and easy to access. [sent-2, score-0.338]

2 Our goal is to build, on the advances of Web search engines, the semantic web, and data representation technologies, a new form of Web search engine that will unleash a revolution of new possibilities. [sent-3, score-1.329]

3 Kngine tries to combine the power of Web search engines with the power of semantic search and data representation to provide meaningful search results that meet user needs. [sent-4, score-1.307]

4 Status: Kngine started as a research project in October 2008. [sent-5, score-0.091]

5 Over time, I have succeeded in collecting, representing, and indexing a lot of human beings' systematic knowledge, but it is just the start. [sent-6, score-0.566]

6 Kngine's knowledge base and capabilities already span a great number of domains, such as: 60,000+ companies, 700,000+ movies, 750,000+ persons, 400,000+ locations, 115,000+ books; about 5,000,000 concepts. [sent-8, score-0.293]

7 Future: Kngine, as it exists today, is just the beginning. [sent-9, score-0.065]

8 I have both short- and long-term plans to dramatically expand all aspects of Kngine: improving its quality, broadening and deepening our data, and more. [sent-10, score-0.117]

9 I just released Kngine Milestone 2 (our first public release); soon, a preview of a section called ‘Labs’, which will include a set of new research and technologies for accessing the knowledge, will be presented. [sent-11, score-0.645]

10 Milestone 2: Milestone 2 is the first public release. [sent-12, score-0.089]

11 This release includes some useful features that help users reach what they want directly, such as: Smart Information; Answer your questions; Link the data, and view direct data. For more information about Milestone 2, go there. [sent-13, score-0.387]
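
The page does not state how sentScore is computed. A minimal sketch of one plausible approach, assuming each sentence is scored by summing the tf-idf weights of the words it contains: the summing rule and the names `tokenize`, `score_sentence`, and `rank_sentences` are illustrative assumptions, and the weights are a few of those listed under "tfidf for this blog".

```python
import re

# A few word weights copied from the "tfidf for this blog" list on this page.
WEIGHTS = {
    "kngine": 0.551,
    "semantic": 0.25,
    "milestone": 0.221,
    "knowledge": 0.22,
    "search": 0.219,
    "research": 0.091,
    "release": 0.09,
    "public": 0.089,
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def score_sentence(sentence, weights):
    """Assumed rule: add up the weight of every known word occurrence."""
    return sum(weights.get(word, 0.0) for word in tokenize(sentence))

def rank_sentences(sentences, weights):
    """Return (score, sentence) pairs, highest score first."""
    scored = [(score_sentence(s, weights), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

sentences = [
    "Kngine tries to combine the power of Web search engines with semantic search.",
    "Milestone 2 is the first public release.",
    "It is just the start.",
]
ranked = rank_sentences(sentences, WEIGHTS)
for score, sentence in ranked:
    print(f"{score:.3f}  {sentence}")
```

Because the actual scoring rule is unknown, this sketch is not expected to reproduce the sentScore values listed above; it only illustrates how word weights can be turned into a sentence ranking.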

similar blogs computed by tfidf model

tfidf for this blog:

wordName wordTfidf (topN-words)

[('kngine', 0.551), ('semantic', 0.25), ('milestone', 0.221), ('knowledge', 0.22), ('search', 0.219), ('systematic', 0.162), ('representation', 0.148), ('meaningful', 0.144), ('collect', 0.129), ('lyrics', 0.117), ('broadening', 0.117), ('datafor', 0.117), ('engine', 0.108), ('unleash', 0.101), ('alexander', 0.098), ('preview', 0.098), ('influenced', 0.098), ('compromising', 0.095), ('tower', 0.093), ('beings', 0.093), ('human', 0.093), ('research', 0.091), ('relations', 0.091), ('succeeded', 0.091), ('release', 0.09), ('public', 0.089), ('qualities', 0.088), ('sale', 0.083), ('goal', 0.082), ('movies', 0.081), ('alive', 0.076), ('include', 0.076), ('objective', 0.074), ('bank', 0.073), ('span', 0.073), ('october', 0.072), ('web', 0.072), ('technologies', 0.071), ('reviews', 0.07), ('aim', 0.07), ('represent', 0.069), ('dark', 0.067), ('results', 0.067), ('tries', 0.066), ('advances', 0.066), ('exists', 0.065), ('organize', 0.065), ('engines', 0.065), ('revolution', 0.065), ('user', 0.065)]
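
The list above pairs each word with its tf-idf weight. A minimal sketch of how such a top-N list could be produced, assuming the textbook definition (raw term count times the log of inverse document frequency, followed by L2 normalization): the page does not say which tf-idf variant or library was used, and the toy corpus and the name `top_tfidf_words` are hypothetical.

```python
import math
from collections import Counter

def top_tfidf_words(docs, doc_index, top_n=5):
    """Return the top_n (word, weight) pairs for docs[doc_index].

    Assumed variant: tf = raw count, idf = log(N / df), then L2-normalize
    the document vector so weights are comparable across documents.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    tf = Counter(tokenized[doc_index])
    raw = {w: count * math.log(n_docs / df[w]) for w, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in raw.values())) or 1.0
    weights = {w: v / norm for w, v in raw.items()}
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return [(w, round(v, 3)) for w, v in ranked[:top_n]]

# Hypothetical three-document corpus, for illustration only.
docs = [
    "kngine semantic search engine kngine knowledge",
    "mysql search engine architecture",
    "facebook architecture mysql memcache",
]
top_words = top_tfidf_words(docs, 0, top_n=3)
print(top_words)
```

Words that are frequent in one post but rare across the corpus, such as 'kngine' here, rise to the top, which is consistent with the shape of the list above.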

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.99999988 630 high scalability-2009-06-14-kngine 'Knowledge Engine' milestone 2

2 0.45525426 746 high scalability-2009-11-26-Kngine Snippet Search New Indexing Technology

Introduction: While Kngine has just announced some improvements and new features, I would like to take you on a short trip through the Snippet Search research project at Kngine. What is Kngine? Kngine is a startup company working on search technologies. We at Kngine aim to organize human beings' systematic knowledge and experience and make it accessible to everyone. We aim to collect and organize all objective data, and make it possible and easy to access. Our goal is to build a Web 3.0 Web search engine on the advances of Web search engines, the Semantic Web, and data representation technologies: a new form of Web search engine that will unleash a revolution of new possibilities. Introduction to Snippet Search: Today, the Web search engine is the gateway to the Web, especially for getting specific information. But unfortunately, search engines haven't changed much as the Web has changed since the 90s. Since the 90s, Web search engines have provided the same kind of results: links to documents. We i

3 0.15163195 658 high scalability-2009-07-17-Against all the odds

Introduction: This article is not about Mariah Carey or her song. It's about storage systems and databases. First, let's describe what is meant by odds: in my social network, I found that 93% of mainstream developers sanctify the database, or at least consider it the ultimate, superhero, undefeatable solution to any data persistence challenge. I think this problem comes from education and from developers personally, and I think some companies are also involved in it. To start fixing this bad thinking, we should all agree on the following points: Every challenge has its own solutions, so whatever you want to save or persist, there are always many solutions. For example, Web search engines such as Google, Kngine, Yahoo, and Bing don't use a database at all; instead, we use indexes (index files) for better performance. The database in general, whatever the vendor, is slow compared with other solutions such as key-value storage systems, index files, and DHTs. The database currently employs the relational data model

4 0.14894962 332 high scalability-2008-05-28-Job queue and search engine

Introduction: Hi, I want to implement a search engine with Lucene. To be scalable, I would like to execute search jobs asynchronously (with a job queuing system). But I don't know if it is a good design... Why? Search results can be large! (e.g., 100+ pages with 25 documents per page.) With an asynchronous system, I need to store results for each search job. I can set a short expiration time (~5 min) for each search result, but it's still large. What do you think about it? Which design would you use for that? Thanks, Mat

5 0.097076245 269 high scalability-2008-03-08-Audiogalaxy.com Architecture

Introduction: Update 3: Always Refer to Your V1 As a Prototype. You really do have to plan to throw one away. Update 2: Lessons Learned Scaling the Audiogalaxy Search Engine. Things he should have done and fun things he couldn’t justify doing. Update: Design details of Audiogalaxy.com’s high performance MySQL search engine. At peak times, the search engine needed to handle 1500-2000 searches every second against a MySQL database with about 200 million rows. Search was one of the most interesting problems at Audiogalaxy. It was one of the core functions of the site, and somewhere between 50 and 70 million searches were performed every day. At peak times, the search engine needed to handle 1500-2000 searches every second against a MySQL database with about 200 million rows.

6 0.095012389 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing

7 0.082150526 258 high scalability-2008-02-24-Yandex Architecture

8 0.077016443 899 high scalability-2010-09-09-How did Google Instant become Faster with 5-7X More Results Pages?

9 0.071218662 342 high scalability-2008-06-08-Search fast in million rows

10 0.070446655 810 high scalability-2010-04-14-Parallel Information Retrieval and Other Search Engine Goodness

11 0.069768913 834 high scalability-2010-06-01-Web Speed Can Push You Off of Google Search Rankings! What Can You Do?

12 0.069539741 856 high scalability-2010-07-12-Creating Scalable Digital Libraries

13 0.069415845 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database

14 0.0690099 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned

15 0.067908019 584 high scalability-2009-04-27-Some Questions from a newbie

16 0.065769807 1233 high scalability-2012-04-25-The Anatomy of Search Technology: blekko’s NoSQL database

17 0.064832196 775 high scalability-2010-02-10-ElasticSearch - Open Source, Distributed, RESTful Search Engine

18 0.06461858 1435 high scalability-2013-04-04-Paper: A Web of Things Application Architecture - Integrating the Real-World into the Web

19 0.064415269 539 high scalability-2009-03-16-Books: Web 2.0 Architectures and Cloud Application Architectures

20 0.064086825 64 high scalability-2007-08-10-How do we make a large real-time search engine?
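
The simValue column presumably measures how close each post's tf-idf vector is to this post's vector. A minimal sketch, assuming cosine similarity over sparse word-weight dictionaries: the page does not name the similarity measure, the post vectors below are made up for illustration, and `cosine` and `most_similar` are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(query_id, vectors):
    """Return (similarity, post_id) pairs for all posts, best first."""
    query = vectors[query_id]
    scored = [(cosine(query, vec), post_id) for post_id, vec in vectors.items()]
    return sorted(scored, reverse=True)

# Made-up tf-idf vectors keyed by blogId, for illustration only.
vectors = {
    630: {"kngine": 0.55, "semantic": 0.25, "search": 0.22},
    746: {"kngine": 0.40, "snippet": 0.35, "search": 0.30},
    562: {"facebook": 0.60, "mysql": 0.30, "memcache": 0.30},
}
results = most_similar(630, vectors)
for similarity, post_id in results:
    print(f"{similarity:.3f}  {post_id}")
```

Under this assumption the query post ranks first with a similarity of about 1, which is consistent with the "same-blog" row heading the list, and posts that share no weighted words score 0.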

similar blogs computed by lsi model

lsi for this blog:

topicId topicWeight

[(0, 0.107), (1, 0.032), (2, 0.04), (3, -0.008), (4, 0.019), (5, 0.003), (6, -0.028), (7, 0.013), (8, 0.027), (9, 0.052), (10, 0.011), (11, -0.013), (12, -0.016), (13, -0.031), (14, 0.036), (15, 0.033), (16, -0.055), (17, 0.004), (18, 0.101), (19, -0.011), (20, 0.04), (21, -0.065), (22, 0.009), (23, 0.034), (24, -0.061), (25, -0.045), (26, -0.109), (27, 0.007), (28, -0.004), (29, 0.083), (30, -0.066), (31, 0.029), (32, -0.061), (33, 0.033), (34, 0.121), (35, -0.017), (36, -0.01), (37, 0.018), (38, -0.058), (39, -0.054), (40, 0.105), (41, 0.01), (42, -0.016), (43, 0.038), (44, -0.059), (45, 0.052), (46, -0.038), (47, 0.052), (48, 0.047), (49, -0.038)]

similar blogs list:

simIndex simValue blogId blogTitle

1 0.95050073 746 high scalability-2009-11-26-Kngine Snippet Search New Indexing Technology

same-blog 2 0.94719821 630 high scalability-2009-06-14-kngine 'Knowledge Engine' milestone 2

3 0.89752328 332 high scalability-2008-05-28-Job queue and search engine

4 0.85110635 342 high scalability-2008-06-08-Search fast in million rows

Introduction: I have a table. This table has many columns, but search is performed on one column, and the table can have more than a million rows. The data in these columns is something like: funny, new york, hollywood. A user can search with parameters such as funny hollywood. I need to take these two words and then check whether the column contains them and how many times. It is not possible to index here. If the query returns, say, 1200 results, then without comparing each and every column I can't determine the number of results. I need to compare each and every column. This query is very frequent. How can I approach this problem? What type of architecture and tools would be helpful? I just know that this can be accomplished with a distributed system, but how can I build such a system? I also see on this website that LinkedIn uses Lucene for search. Is Lucene helpful in my case? My table also has lots of insertions; however, updates are not very frequent.

5 0.81478143 246 high scalability-2008-02-12-Search the tags across all post

Introduction: Let's suppose I have a table which stores tags. A user can enter keywords, and I have to search through all the records in the table and find the posts which contain the tags entered by the user. The user can enter more than one keyword. What strategy or technique should I use to search fast? There may be millions of records, and many users are firing the same query. Thanks

6 0.76736856 258 high scalability-2008-02-24-Yandex Architecture

7 0.75192219 810 high scalability-2010-04-14-Parallel Information Retrieval and Other Search Engine Goodness

8 0.74646783 1601 high scalability-2014-02-25-Peter Norvig's 9 Master Steps to Improving a Program

9 0.72691262 899 high scalability-2010-09-09-How did Google Instant become Faster with 5-7X More Results Pages?

10 0.70592123 775 high scalability-2010-02-10-ElasticSearch - Open Source, Distributed, RESTful Search Engine

11 0.70123035 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing

12 0.64878738 269 high scalability-2008-03-08-Audiogalaxy.com Architecture

13 0.64649194 64 high scalability-2007-08-10-How do we make a large real-time search engine?

14 0.64202595 1253 high scalability-2012-05-28-The Anatomy of Search Technology: Crawling using Combinators

15 0.6048969 1295 high scalability-2012-08-02-Ask DuckDuckGo: Is there Anything you Want to Know About DDG?

16 0.5851506 1610 high scalability-2014-03-11-Douglas Adams - 3 Rules that Describe Our Reactions to Technologies

17 0.53639835 856 high scalability-2010-07-12-Creating Scalable Digital Libraries

18 0.53513622 689 high scalability-2009-08-28-Strategy: Solve Only 80 Percent of the Problem

19 0.53453356 1233 high scalability-2012-04-25-The Anatomy of Search Technology: blekko’s NoSQL database

20 0.53179926 335 high scalability-2008-05-30-Is "Scaling Engineer" a new job title?

similar blogs computed by lda model

lda for this blog:

topicId topicWeight

[(1, 0.048), (2, 0.098), (10, 0.036), (28, 0.353), (61, 0.235), (79, 0.107), (94, 0.014)]

similar blogs list:

simIndex simValue blogId blogTitle

same-blog 1 0.89227086 630 high scalability-2009-06-14-kngine 'Knowledge Engine' milestone 2

2 0.73690701 562 high scalability-2009-04-10-Facebook's Aditya giving presentation on Facebook Architecture

Introduction: Facebook's engineering director Aditya talks about Facebook's architecture: how they use MySQL, PHP, and memcache, and how they have modified them to suit their requirements.

3 0.72388852 606 high scalability-2009-05-25-non-sequential, unique identifier, strategy question

Introduction: (Please bear with me, I'm a new, passionate, confident and terrified programmer :D ) Background: I'm pre-launch and 1 year into the development of my application. My target is to be able to eventually handle millions of registered users with 5-10% of them concurrent. Up to this point I've used auto-increment to assign unique identifiers to rows. I am now considering switching to a non-sequential strategy. Oh, I'm using the LAMP configuration. My reasons for avoiding auto-increment: 1. Complicates replication when scaling horizontally. Risk of collision is significant (when running multiple masters). Note: I've read the other entries in this forum that relate to ID generation and there have been some great suggestions, including a strategy that uses auto-increment in a way that avoids this pitfall... That said, I'm still nervous about it. 2. Potential bottleneck when retrieving/assigning IDs, since IDs are assigned at the database. My reasons for being nervous about

4 0.68248624 1294 high scalability-2012-08-01-Prismatic Update: Machine Learning on Documents and Users

Introduction: In an update to Prismatic Architecture - Using Machine Learning on Social Networks to Figure Out What You Should Read on the Web, Jason Wolfe, even in the face of deadening fatigue from long nights spent getting their iPhone app out, has gallantly agreed to talk a little more about Prismatic's approach to machine learning. Documents and users are two areas where Prismatic applies ML (machine learning): ML on Documents. Given an HTML document: learn how to extract the main text of the page (rather than the sidebar, footer, comments, etc.), its title, author, best images, etc.; determine features for relevance (e.g., what the article is about, topics, etc.). The setup for most of these tasks is pretty typical. Models are trained using big batch jobs on other machines that read data from s3, save the learned parameter files to s3, and then read (and periodically refresh) the models from s3 in the ingest pipeline. All of the data that flows out of the system can be

5 0.64421731 746 high scalability-2009-11-26-Kngine Snippet Search New Indexing Technology

6 0.64238656 1201 high scalability-2012-02-29-Strategy: Put Mobile Video Into Cold Storage After 30 Days

7 0.60303438 347 high scalability-2008-07-07-Five Ways to Stop Framework Fixation from Crashing Your Scaling Strategy

8 0.60078704 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing

9 0.59266877 930 high scalability-2010-10-28-NoSQL Took Away the Relational Model and Gave Nothing Back

10 0.58992094 793 high scalability-2010-03-10-Saying Yes to NoSQL; Going Steady with Cassandra at Digg

11 0.5893212 1287 high scalability-2012-07-20-Stuff The Internet Says On Scalability For July 20, 2012

12 0.58836472 1506 high scalability-2013-08-23-Stuff The Internet Says On Scalability For August 23, 2013

13 0.58742452 265 high scalability-2008-03-03-Two data streams for a happy website

14 0.58409548 739 high scalability-2009-11-09-10 NoSQL Systems Reviewed

15 0.58384949 1411 high scalability-2013-02-22-Stuff The Internet Says On Scalability For February 22, 2013

16 0.58074999 324 high scalability-2008-05-19-UK Based CDN

17 0.58007425 580 high scalability-2009-04-24-INFOSCALE 2009 in June in Hong Kong

18 0.57858986 173 high scalability-2007-12-05-Easier Production Releases

19 0.5769133 1303 high scalability-2012-08-13-Ask HighScalability: Facing scaling issues with news feeds on Redis. Any advice?

20 0.57516152 903 high scalability-2010-09-17-Hot Scalability Links For Sep 17, 2010