high_scalability high_scalability-2009 high_scalability-2009-628 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means, Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
sentIndex sentText sentNum sentScore
1 Update: Social networks in the database: using a graph database . [sent-1, score-0.727]
2 A nice post on representing, traversing, and performing other common social network operations using a graph database. [sent-2, score-0.707]
3 If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. [sent-3, score-0.958]
4 For those of more modest means, Neo4j, a graph database, is a good alternative. [sent-4, score-0.591]
5 A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. [sent-5, score-0.905]
6 Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. [sent-6, score-0.545]
7 They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. [sent-8, score-0.605]
8 So you can look at a graph database as a key-value store, with full support for relationships. [sent-9, score-0.67]
9 " A graph looks something like: For more lovely examples take a look at the Graph Image Gallery . [sent-10, score-0.652]
10 Here's a good summary by Emil Eifrem, founder of Neo4j, making the case for why graph databases rule: Most applications today handle data that is deeply associative, i. [sent-11, score-0.719]
11 The most obvious example of this is social networking sites, but even tagging systems, content management systems and wikis deal with inherently hierarchical or graph-shaped data. [sent-14, score-0.378]
12 In essence, each traversal along a link in a graph is a join, and joins are known to be very expensive. [sent-16, score-0.791]
13 A graph database uses nodes, relationships between nodes and key-value properties instead of tables to represent information. [sent-19, score-1.215]
14 Instead of static and rigid tables, rows and columns, you work with a flexible graph network consisting of nodes, relationships and properties. [sent-28, score-0.781]
15 Disk-based, native storage manager completely optimized for storing graph structures for maximum performance and scalability. [sent-35, score-0.657]
16 Retrieving children is trivial in a graph database. [sent-44, score-0.591]
17 No need to flatten and serialize an object graph as graphs are native to a graph database. [sent-45, score-1.345]
18 a graph, you will have to retrieve the nodes for every traversal step (very fast) and then match them yourself in some manner (e. [sent-64, score-0.379]
19 This gives you almost unlimited flexibility regarding the layout of your data and domain object graph and very fast deep traversals (hops over several nodes) since they are handled natively by the Neo4j engine down to the storage layer and not your client code. [sent-68, score-0.873]
20 The drawback is that for huge data amounts (>1Billion nodes) the clustering and partitioning of the graph becomes non-trivial, which is one of the areas we are working on. [sent-69, score-0.724]
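Sentences 5 to 8 and 12 above describe the property-graph model (nodes and relationships that both hold key-value pairs) and the claim that following a relationship is cheaper than a relational join. A minimal sketch of that idea in plain Python might look like the following. It is not Neo4j's API: the `PropertyGraph` class, its method names, and the sample people are all hypothetical, made up only to illustrate the model.

```python
from collections import deque


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, not Neo4j's API).

    Nodes and relationships both carry arbitrary key-value properties.
    """

    def __init__(self):
        self.nodes = {}      # node id -> properties dict
        self.outgoing = {}   # node id -> list of (rel_type, target id, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = dict(props)
        self.outgoing.setdefault(node_id, [])

    def add_relationship(self, source, rel_type, target, **props):
        # The relationship is stored next to its start node, so following it
        # is a direct lookup rather than a join between tables.
        self.outgoing[source].append((rel_type, target, dict(props)))

    def traverse(self, start, rel_type, max_depth):
        """Breadth-first traversal along rel_type.

        Returns {node id: depth} for every node reachable within
        max_depth hops, excluding the start node itself.
        """
        depths = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            if depths[current] == max_depth:
                continue  # do not expand beyond the requested depth
            for kind, target, _props in self.outgoing[current]:
                if kind == rel_type and target not in depths:
                    depths[target] = depths[current] + 1
                    queue.append(target)
        del depths[start]
        return depths


graph = PropertyGraph()
for name in ("alice", "bob", "carol", "dave"):
    graph.add_node(name, kind="person")
graph.add_relationship("alice", "KNOWS", "bob", since=2007)
graph.add_relationship("bob", "KNOWS", "carol", since=2008)
graph.add_relationship("carol", "KNOWS", "dave", since=2009)

friends_within_two_hops = graph.traverse("alice", "KNOWS", max_depth=2)
print(friends_within_two_hops)  # {'bob': 1, 'carol': 2}
```

Each hop here costs one dictionary lookup per visited node, which is the intuition behind the "deep traversals" claim in sentence 19. A real graph database additionally persists this structure on disk and wraps it in transactions, which this sketch does not attempt.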
wordName wordTfidf (topN-words)
[('graph', 0.591), ('traversal', 0.2), ('relationships', 0.19), ('nodes', 0.179), ('relational', 0.125), ('social', 0.116), ('represent', 0.114), ('associative', 0.111), ('wikis', 0.102), ('graphs', 0.097), ('oo', 0.095), ('semantic', 0.092), ('tagging', 0.087), ('regarding', 0.083), ('database', 0.079), ('pairs', 0.078), ('hierarchical', 0.073), ('data', 0.072), ('native', 0.066), ('difficult', 0.066), ('hadoop', 0.065), ('hank', 0.064), ('anders', 0.064), ('graphsby', 0.064), ('managementby', 0.064), ('traversals', 0.064), ('wheeler', 0.064), ('xa', 0.064), ('flexibility', 0.063), ('properties', 0.062), ('comparison', 0.062), ('annotations', 0.061), ('databy', 0.061), ('drawback', 0.061), ('lovely', 0.061), ('slap', 0.061), ('thehacker', 0.061), ('persistent', 0.059), ('model', 0.058), ('speedy', 0.058), ('eifrem', 0.058), ('philosophical', 0.058), ('tx', 0.058), ('connect', 0.057), ('transactional', 0.057), ('networks', 0.057), ('handle', 0.056), ('databaseby', 0.056), ('depths', 0.056), ('opaque', 0.056)]
simIndex simValue blogId blogTitle
same-blog 1 0.99999994 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
2 0.42092043 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
Introduction: Relational databases, document databases, and distributed hash tables get most of the hype these days, but there's another option: graph databases. Back to the future it seems. Here's a really interesting paper by Marko A. Rodriguez introducing the graph model and its extension to representing the world wide web of data. Modern-day open source and commercial graph databases can store on the order of 1 billion relationships with some databases reaching the 10 billion mark. These developments are making the graph database practical for applications that require large-scale knowledge structures. Moreover, with the Web of Data standards set forth by the Linked Data community, it is possible to interlink graph databases across the web into a giant global knowledge structure. This talk will discuss graph databases, their underlying data model, their querying mechanisms, and the benefits of the graph data structure for modeling and analysis.
3 0.36316985 621 high scalability-2009-06-06-Graph server
Introduction: I've seen it mentioned a few times that sites like Digg or LinkedIn use graph servers to hold their social graphs. But the only open source graph server of that sort I've found is http://neo4j.org/. Can anyone recommend an open source graph server? Thanks, Aaron
Introduction: On the surface nothing appears more different than soft data and hard raw materials like iron. Then isn’t it ironic , in the Alanis Morissette sense, that in this Age of Information, great wealth still lies hidden deep beneath piles of stuff? It's so strange how directly digging for dollars in data parallels the great wealth producing models of the Industrial Revolution. The piles of stuff is the Internet. It takes lots of prospecting to find the right stuff. Mighty web crawling machines tirelessly collect stuff, bringing it into their huge maws, then depositing load after load into rack after rack of distributed file system machines. Then armies of still other machines take this stuff and strip out the valuable raw materials, which in the Information Age, are endless bytes of raw data. Link clicks, likes, page views, content, head lines, searches, inbound links, outbound links, search clicks, hashtags, friends, purchases: anything and everything you do on the Internet is a valu
5 0.32178891 1406 high scalability-2013-02-14-When all the Program's a Graph - Prismatic's Plumbing Library
Introduction: At some point as a programmer you might have the insight/fear that all programming is just doing stuff to other stuff. Then you may observe after coding the same stuff over again that stuff in a program often takes the form of interacting patterns of flows. Then you may think hey, a program isn't only useful for coding datastructures, but a program is a kind of datastructure and that with a meta level jump you could program a program in terms of flows over data and flow over other flows. That's the kind of stuff Prismatic is making available in the Graph extension to their plumbing package (code examples), which is described in an excellent post: Graph: Abstractions for Structured Computation. You may remember Prismatic from a previous profile we did on HighScalability: Prismatic Architecture - Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web. We learned how Prismatic, an interest driven content suggestion service, builds programs in
6 0.30661494 1088 high scalability-2011-07-27-Making Hadoop 1000x Faster for Graph Problems
7 0.29352927 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
8 0.24407195 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
9 0.23487116 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
10 0.22989227 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
11 0.22738686 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
12 0.21527283 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
13 0.2113574 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
14 0.19120276 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
15 0.18661219 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009
16 0.17772891 806 high scalability-2010-04-08-Hot Scalability Links for April 8, 2010
17 0.16831443 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
18 0.16088131 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
19 0.1502125 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
20 0.14337483 589 high scalability-2009-05-05-Drop ACID and Think About Data
topicId topicWeight
[(0, 0.203), (1, 0.108), (2, 0.022), (3, 0.065), (4, 0.068), (5, 0.19), (6, -0.077), (7, -0.031), (8, 0.034), (9, 0.135), (10, 0.164), (11, 0.027), (12, -0.104), (13, -0.135), (14, -0.035), (15, -0.041), (16, -0.022), (17, 0.315), (18, 0.112), (19, 0.159), (20, -0.256), (21, -0.091), (22, -0.002), (23, -0.088), (24, -0.091), (25, 0.084), (26, -0.036), (27, 0.063), (28, 0.086), (29, -0.075), (30, -0.058), (31, -0.046), (32, 0.013), (33, -0.037), (34, -0.037), (35, 0.113), (36, -0.006), (37, 0.007), (38, -0.023), (39, 0.023), (40, -0.015), (41, -0.03), (42, 0.057), (43, 0.022), (44, -0.019), (45, 0.007), (46, 0.046), (47, 0.014), (48, -0.012), (49, -0.024)]
simIndex simValue blogId blogTitle
same-blog 1 0.97648346 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
2 0.95966202 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
Introduction: Relational databases, document databases, and distributed hash tables get most of the hype these days, but there's another option: graph databases. Back to the future it seems. Here's a really interesting paper by Marko A. Rodriguez introducing the graph model and its extension to representing the world wide web of data. Modern-day open source and commercial graph databases can store on the order of 1 billion relationships with some databases reaching the 10 billion mark. These developments are making the graph database practical for applications that require large-scale knowledge structures. Moreover, with the Web of Data standards set forth by the Linked Data community, it is possible to interlink graph databases across the web into a giant global knowledge structure. This talk will discuss graph databases, their underlying data model, their querying mechanisms, and the benefits of the graph data structure for modeling and analysis.
3 0.93740851 1406 high scalability-2013-02-14-When all the Program's a Graph - Prismatic's Plumbing Library
Introduction: At some point as a programmer you might have the insight/fear that all programming is just doing stuff to other stuff. Then you may observe after coding the same stuff over again that stuff in a program often takes the form of interacting patterns of flows. Then you may think hey, a program isn't only useful for coding datastructures, but a program is a kind of datastructure and that with a meta level jump you could program a program in terms of flows over data and flow over other flows. That's the kind of stuff Prismatic is making available in the Graph extension to their plumbing package (code examples), which is described in an excellent post: Graph: Abstractions for Structured Computation. You may remember Prismatic from a previous profile we did on HighScalability: Prismatic Architecture - Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web. We learned how Prismatic, an interest driven content suggestion service, builds programs in
4 0.93646169 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
Introduction: GraphChi uses a Parallel Sliding Windows method which can: process a graph with mutable edge values efficiently from disk, with only a small number of non-sequential disk accesses, while supporting the asynchronous model of computation. The result is that graphs with billions of edges can be processed on just a single machine. It uses a vertex-centric computation model similar to Pregel, which supports iterative algorithms as opposed to the batch style of MapReduce. Streaming graph updates are supported. About GraphChi, Carlos Guestrin, codirector of Carnegie Mellon's Select Lab, says: A Mac Mini running GraphChi can analyze Twitter's social graph from 2010 (which contains 40 million users and 1.2 billion connections) in 59 minutes. "The previous published result on this problem took 400 minutes using a cluster of about 1,000 computers." Related Articles Aapo Kyrola Home Page Your Laptop Can Now Analyze Big Data by JOHN PAVLUS Example Applications Runn
5 0.89250946 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
Introduction: With the success of Neo4j as a graph database in the NoSQL revolution, it's interesting to see another graph database, HyperGraphDB , in the mix. Their quick blurb on HyperGraphDB says it is a: general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects, it can also be used as an embedded object-oriented database for projects of all sizes. From the NoSQL Archive the summary on HyperGraphDB is: API: Java (and Java Langs), Written in: Java , Query Method: Java or P2P, Replication: P2P , Concurrency: STM , Misc: Open-Source, Especially for AI and Semantic Web. So it has some interesting features, like software transactional memory and P2P for data distribution , but I found that my first and most obvious question was not answered: what the heck is a hypergraph and why do I care? Buried in the tutorial was: A HyperGraphD
6 0.87714678 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
7 0.83873481 631 high scalability-2009-06-15-Large-scale Graph Computing at Google
8 0.83457184 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
9 0.81628591 621 high scalability-2009-06-06-Graph server
10 0.80522531 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned
11 0.77152401 1088 high scalability-2011-07-27-Making Hadoop 1000x Faster for Graph Problems
12 0.75947881 155 high scalability-2007-11-15-Video: Dryad: A general-purpose distributed execution platform
13 0.73532188 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
14 0.71377337 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009
15 0.67600089 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
16 0.67205769 58 high scalability-2007-08-04-Product: Cacti
17 0.60439342 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
18 0.59750736 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
19 0.59543484 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
20 0.57438189 542 high scalability-2009-03-17-IBM WebSphere eXtreme Scale (IMDG)
topicId topicWeight
[(1, 0.097), (2, 0.249), (10, 0.046), (17, 0.019), (30, 0.012), (40, 0.024), (47, 0.027), (48, 0.083), (49, 0.011), (51, 0.012), (61, 0.122), (77, 0.02), (79, 0.112), (85, 0.042), (91, 0.013), (94, 0.02), (99, 0.016)]
simIndex simValue blogId blogTitle
same-blog 1 0.96446091 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
2 0.95735466 873 high scalability-2010-08-06-Hot Scalability Links for Aug 6, 2010
Introduction: Twitter Sees Its 20 Billionth Tweet writes Marshall Kirkpatrick of ReadWriteWeb. Startups die for not having customers, so STOP thinking about how to scale. Alessandro Orsi says focusing on the architecture and scaling possibilities of your app for millions of users is just plain dumb...concentrate on marketing...concentrate on user experience. Alessandro is perfectly correct, but this isn't the year 2000, when the default architecture that is easy is also not scalable and when sites were built from scratch one painful user at a time. Today neither is true. In the era of social networks, where Facebook has 500 million users, successful applications can and often do spike to millions of users seemingly overnight. And you have to have some architecture. With today's tool-chains you don't have to choose easy and non-scalable. There are other options. Of course, it's all pointless without customers and that is what you need to worry about, but it's a false choice in this era to
3 0.95231456 672 high scalability-2009-08-06-An Unorthodox Approach to Database Design : The Coming of the Shard
Introduction: Update 4: Why you don’t want to shard. by Morgon on the MySQL Performance Blog. Optimize everything else first, and then if performance still isn’t good enough, it’s time to take a very bitter medicine. Update 3: Building Scalable Databases: Pros and Cons of Various Database Sharding Schemes by Dare Obasanjo. Excellent discussion of why and when you would choose a sharding architecture, how to shard, and problems with sharding. Update 2: Mr. Moore gets to punt on sharding by Alan Rimm-Kaufman of 37signals. Insightful article on design tradeoffs and the evils of premature optimization. With more memory, more CPU, and new tech like SSD, problems can be avoided before more exotic architectures like sharding are needed. Add features not infrastructure. Jeremy Zawodny says he's wrong wrong wrong. we're running multi-core CPUs at slower clock speeds. Moore won't save you. Update: Dan Pritchett shares some excellent Sharding Lessons : Size Your Shards, Use Math on Shard C
4 0.95077902 714 high scalability-2009-10-02-HighScalability has Moved to Squarespace.com!
Introduction: You may have noticed something is a little different when visiting HighScalability today: We've Moved! HighScalability.com has switched hosting services to Squarespace.com. House warming gifts are completely unnecessary. Thanks for the thought though. It's been a long long long process. Importing a largish Drupal site to Wordpress and then into Squarespace is a bit like dental work without the happy juice, but the results are worth it. While the site is missing a few features I think it looks nicer, feels faster, and I'm betting it will be more scalable and more reliable. All good things. I'll explain more about the move later in this post, but there's some administrivia that needs to be handled to make the move complete: If you have a user account and have posted on HighScalability before then you have a user account, but since I don't know your passwords I had to make new passwords up for you. So please contact me and I'll give you your password so you can login and change it.
5 0.94998056 306 high scalability-2008-04-21-The Search for the Source of Data - How SimpleDB Differs from a RDBMS
Introduction: Update 2: Yurii responds with the Top 10 Reasons to Avoid Document Databases FUD . Update: Top 10 Reasons to Avoid the SimpleDB Hype by Ryan Park provides a well written counter take. Am I really that fawning? If so, doesn't that make me a dear? All your life you've used a relational database. At the tender age of five you banged out your first SQL query to track your allowance. Your RDBMS allegiance was just assumed, like your politics or religion would have been assumed 100 years ago. They now say--you know them--that relations won't scale and we have to do things differently. New databases like SimpleDB and BigTable are what's different. As a long time RDBMS user what can you expect of SimpleDB? That's what Alex Tolley of MyMeemz.com set out to discover. Like many brave explorers before him, Alex gave a report of his adventures to the Royal Society of the AWS Meetup . Alex told a wild almost unbelievable tale of cultures and practices so different from our own you alm
6 0.94922 1074 high scalability-2011-07-06-11 Common Web Use Cases Solved in Redis
7 0.94850707 1020 high scalability-2011-04-12-Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast
10 0.94695866 1186 high scalability-2012-02-02-The Data-Scope Project - 6PB storage, 500GBytes-sec sequential IO, 20M IOPS, 130TFlops
11 0.94675636 1112 high scalability-2011-09-07-What Google App Engine Price Changes Say About the Future of Web Architecture
12 0.94656205 961 high scalability-2010-12-21-SQL + NoSQL = Yes !
13 0.94633448 1395 high scalability-2013-01-28-DuckDuckGo Architecture - 1 Million Deep Searches a Day and Growing
14 0.94632232 589 high scalability-2009-05-05-Drop ACID and Think About Data
15 0.94613552 1017 high scalability-2011-04-06-Netflix: Run Consistency Checkers All the time to Fixup Transactions
16 0.94578141 825 high scalability-2010-05-10-Sify.com Architecture - A Portal at 3900 Requests Per Second
17 0.9457233 661 high scalability-2009-07-25-Latency is Everywhere and it Costs You Sales - How to Crush it
18 0.94559544 920 high scalability-2010-10-15-Troubles with Sharding - What can we learn from the Foursquare Incident?
19 0.94517159 1153 high scalability-2011-12-08-Update on Scalable Causal Consistency For Wide-Area Storage With COPS
20 0.94480562 1509 high scalability-2013-08-30-Stuff The Internet Says On Scalability For August 30, 2013