high_scalability high_scalability-2010 high_scalability-2010-766 knowledge-graph by maker-knowledge-mining
Source: html
Introduction: With the success of Neo4j as a graph database in the NoSQL revolution, it's interesting to see another graph database, HyperGraphDB, in the mix. Their quick blurb on HyperGraphDB says it is a: general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. It is a graph database designed specifically for artificial intelligence and semantic web projects; it can also be used as an embedded object-oriented database for projects of all sizes. From the NoSQL Archive the summary on HyperGraphDB is: API: Java (and Java Langs), Written in: Java, Query Method: Java or P2P, Replication: P2P, Concurrency: STM, Misc: Open-Source, Especially for AI and Semantic Web. So it has some interesting features, like software transactional memory and P2P for data distribution, but I found that my first and most obvious question was not answered: what the heck is a hypergraph and why do I care? Buried in the tutorial was: A HyperGraphD
sentIndex sentText sentNum sentScore
1 With the success of Neo4j as a graph database in the NoSQL revolution, it's interesting to see another graph database, HyperGraphDB, in the mix. [sent-1, score-0.834]
2 Their quick blurb on HyperGraphDB says it is a: general purpose, extensible, portable, distributed, embeddable, open-source data storage mechanism. [sent-2, score-0.148]
3 It is a graph database designed specifically for artificial intelligence and semantic web projects; it can also be used as an embedded object-oriented database for projects of all sizes. [sent-3, score-1.197]
4 So it has some interesting features, like software transactional memory and P2P for data distribution, but I found that my first and most obvious question was not answered: what the heck is a hypergraph and why do I care? [sent-5, score-0.333]
5 Buried in the tutorial was: A HyperGraphDB database is a generalized graph of entities. [sent-6, score-0.598]
6 The generalization is two-fold: (1) links/edges "point to" an arbitrary number of elements instead of just two, as in regular graphs; (2) links can be pointed to by other links as well. [sent-7, score-0.552]
7 OK, but I wish there was some explanation of why this is valuable. [sent-8, score-0.083]
8 What can I do with it that I can't do with normal graphs? [sent-9, score-0.069]
9 Given that there have been concerns over the complexity of the API, this would seem a natural topic to cover. [sent-10, score-0.242]
10 I assume it's cool, it sounds cool, but I would like to know why :-) In any case it looks like an interesting product to take a look at. [sent-11, score-0.168]
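The two-fold generalization quoted in sentence 6 can be made concrete with a small in-memory sketch. This is not the HyperGraphDB API: the class `TinyHypergraph`, its methods, and the example values are all hypothetical names chosen for illustration. It only shows the data model, in which every stored item is an atom, a link is an atom with any number of targets, and a link can itself be the target of another link.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Atom:
    """A stored item. An atom with one or more targets acts as a link."""
    handle: int
    value: Any
    targets: Tuple[int, ...] = ()


class TinyHypergraph:
    """Hypothetical toy store illustrating the hypergraph model (not HyperGraphDB)."""

    def __init__(self) -> None:
        self._atoms: Dict[int, Atom] = {}
        self._incidence: Dict[int, List[int]] = {}  # handle -> links pointing at it
        self._next_handle = 1

    def add(self, value: Any, *targets: int) -> int:
        """Add a plain atom (no targets) or a link over any number of existing atoms."""
        for t in targets:
            if t not in self._atoms:
                raise KeyError(f"unknown target handle: {t}")
        handle = self._next_handle
        self._next_handle += 1
        self._atoms[handle] = Atom(handle, value, tuple(targets))
        self._incidence[handle] = []
        # Record the new link once per distinct target, preserving order.
        for t in dict.fromkeys(targets):
            self._incidence[t].append(handle)
        return handle

    def get(self, handle: int) -> Atom:
        return self._atoms[handle]

    def incident_links(self, handle: int) -> List[int]:
        """Handles of all links that point to the given atom."""
        return list(self._incidence[handle])


g = TinyHypergraph()
alice, bob, carol = g.add("Alice"), g.add("Bob"), g.add("Carol")

# Generalization 1: one link relates three elements, not just two.
meeting = g.add("met-together", alice, bob, carol)

# Generalization 2: a link points to another link (a statement about a statement).
claim = g.add("reported-by", meeting, alice)

print(g.get(meeting).targets)
print(g.incident_links(meeting))
```

In an ordinary graph, the three-way "met-together" relationship would have to be split into pairwise edges or routed through an extra node, and the "reported-by" statement would need the relationship itself to be promoted to a node. That is one plausible answer to the "what can I do with it that I can't do with normal graphs?" question, though the original post leaves it open.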
wordName wordTfidf (topN-words)
[('hypergraphdb', 0.523), ('graph', 0.278), ('semantic', 0.224), ('dimitri', 0.158), ('embeddable', 0.158), ('misc', 0.148), ('blurb', 0.148), ('generalization', 0.136), ('buried', 0.132), ('links', 0.131), ('stm', 0.128), ('kicks', 0.123), ('thenosql', 0.12), ('java', 0.119), ('graphs', 0.119), ('projects', 0.118), ('database', 0.116), ('pointed', 0.111), ('generalized', 0.109), ('artificial', 0.106), ('answered', 0.103), ('heck', 0.1), ('extensible', 0.1), ('heterogeneous', 0.1), ('expanding', 0.097), ('portable', 0.096), ('tutorial', 0.095), ('concerns', 0.093), ('interesting', 0.092), ('ai', 0.092), ('cool', 0.091), ('elements', 0.088), ('api', 0.088), ('intelligence', 0.087), ('revolution', 0.087), ('arbitrary', 0.086), ('wish', 0.083), ('querying', 0.082), ('embedded', 0.079), ('natural', 0.076), ('assume', 0.076), ('method', 0.075), ('topic', 0.073), ('specifically', 0.073), ('sources', 0.071), ('obvious', 0.071), ('success', 0.07), ('purpose', 0.07), ('transactional', 0.07), ('normal', 0.069)]
simIndex simValue blogId blogTitle
same-blog 1 1.0000001 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
2 0.23487116 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
3 0.17546949 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
Introduction: Relational databases, document databases, and distributed hash tables get most of the hype these days, but there's another option: graph databases. Back to the future it seems. Here's a really interesting paper by Marko A. Rodriguez introducing the graph model and its extension to representing the world wide web of data. Modern-day open source and commercial graph databases can store on the order of 1 billion relationships with some databases reaching the 10 billion mark. These developments are making the graph database practical for applications that require large-scale knowledge structures. Moreover, with the Web of Data standards set forth by the Linked Data community, it is possible to interlink graph databases across the web into a giant global knowledge structure. This talk will discuss graph databases, their underlying data model, their querying mechanisms, and the benefits of the graph data structure for modeling and analysis.
Introduction: On the surface nothing appears more different than soft data and hard raw materials like iron. Then isn’t it ironic, in the Alanis Morissette sense, that in this Age of Information, great wealth still lies hidden deep beneath piles of stuff? It's so strange how directly digging for dollars in data parallels the great wealth producing models of the Industrial Revolution. The piles of stuff is the Internet. It takes lots of prospecting to find the right stuff. Mighty web crawling machines tirelessly collect stuff, bringing it into their huge maws, then depositing load after load into rack after rack of distributed file system machines. Then armies of still other machines take this stuff and strip out the valuable raw materials, which in the Information Age, are endless bytes of raw data. Link clicks, likes, page views, content, headlines, searches, inbound links, outbound links, search clicks, hashtags, friends, purchases: anything and everything you do on the Internet is a valu
5 0.16182899 621 high scalability-2009-06-06-Graph server
Introduction: I've seen it mentioned a few times that sites like Digg or LinkedIn use graph servers to hold their social graphs. But the only sort of open source graph server I've found is http://neo4j.org/. Can anyone recommend an open source graph server? Thanks Aaron
6 0.14505507 1088 high scalability-2011-07-27-Making Hadoop 1000x Faster for Graph Problems
7 0.14330953 1406 high scalability-2013-02-14-When all the Program's a Graph - Prismatic's Plumbing Library
8 0.14230844 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
9 0.12938973 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
10 0.113905 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
11 0.11389025 459 high scalability-2008-12-03-Java World Interview on Scalability and Other Java Scalability Secrets
12 0.10666093 1530 high scalability-2013-10-11-Stuff The Internet Says On Scalability For October 11th, 2013
13 0.10655443 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
14 0.10080738 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
15 0.097787373 954 high scalability-2010-12-06-What the heck are you actually using NoSQL for?
16 0.095758379 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
17 0.091833338 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
18 0.083730131 155 high scalability-2007-11-15-Video: Dryad: A general-purpose distributed execution platform
19 0.083553307 1460 high scalability-2013-05-17-Stuff The Internet Says On Scalability For May 17, 2013
20 0.081470825 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
topicId topicWeight
[(0, 0.123), (1, 0.051), (2, -0.008), (3, 0.052), (4, 0.04), (5, 0.099), (6, -0.055), (7, -0.018), (8, 0.026), (9, 0.066), (10, 0.044), (11, -0.011), (12, -0.042), (13, -0.069), (14, -0.044), (15, -0.054), (16, 0.008), (17, 0.145), (18, 0.063), (19, 0.079), (20, -0.144), (21, -0.089), (22, -0.026), (23, -0.039), (24, -0.04), (25, 0.045), (26, 0.007), (27, 0.007), (28, 0.039), (29, -0.015), (30, -0.058), (31, -0.018), (32, -0.005), (33, -0.043), (34, -0.006), (35, 0.092), (36, -0.0), (37, -0.01), (38, -0.012), (39, 0.01), (40, -0.006), (41, 0.001), (42, 0.023), (43, 0.011), (44, -0.014), (45, -0.003), (46, 0.061), (47, 0.01), (48, -0.027), (49, -0.004)]
simIndex simValue blogId blogTitle
same-blog 1 0.94550282 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
2 0.92703307 1406 high scalability-2013-02-14-When all the Program's a Graph - Prismatic's Plumbing Library
Introduction: At some point as a programmer you might have the insight/fear that all programming is just doing stuff to other stuff. Then you may observe after coding the same stuff over again that stuff in a program often takes the form of interacting patterns of flows. Then you may think hey, a program isn't only useful for coding datastructures, but a program is a kind of datastructure and that with a meta level jump you could program a program in terms of flows over data and flow over other flows. That's the kind of stuff Prismatic is making available in the Graph extension to their plumbing package (code examples), which is described in an excellent post: Graph: Abstractions for Structured Computation. You may remember Prismatic from the previous profile we did on HighScalability: Prismatic Architecture - Using Machine Learning On Social Networks To Figure Out What You Should Read On The Web. We learned how Prismatic, an interest driven content suggestion service, builds programs in
3 0.91186202 628 high scalability-2009-06-13-Neo4j - a Graph Database that Kicks Buttox
Introduction: Update: Social networks in the database: using a graph database. A nice post on representing, traversing, and performing other common social network operations using a graph database. If you are Digg or LinkedIn you can build your own speedy graph database to represent your complex social network relationships. For those of more modest means Neo4j, a graph database, is a good alternative. A graph is a collection of nodes (things) and edges (relationships) that connect pairs of nodes. Slap properties (key-value pairs) on nodes and relationships and you have a surprisingly powerful way to represent most anything you can think of. In a graph database "relationships are first-class citizens. They connect two nodes and both nodes and relationships can hold an arbitrary amount of key-value pairs. So you can look at a graph database as a key-value store, with full support for relationships." A graph looks something like: For more lovely examples take a look at the Graph Image Gal
4 0.90415722 626 high scalability-2009-06-10-Paper: Graph Databases and the Future of Large-Scale Knowledge Management
Introduction: Relational databases, document databases, and distributed hash tables get most of the hype these days, but there's another option: graph databases. Back to the future it seems. Here's a really interesting paper by Marko A. Rodriguez introducing the graph model and its extension to representing the world wide web of data. Modern-day open source and commercial graph databases can store on the order of 1 billion relationships with some databases reaching the 10 billion mark. These developments are making the graph database practical for applications that require large-scale knowledge structures. Moreover, with the Web of Data standards set forth by the Linked Data community, it is possible to interlink graph databases across the web into a giant global knowledge structure. This talk will discuss graph databases, their underlying data model, their querying mechanisms, and the benefits of the graph data structure for modeling and analysis.
5 0.87472475 1285 high scalability-2012-07-18-Disks Ain't Dead Yet: GraphChi - a disk-based large-scale graph computation
Introduction: GraphChi uses a Parallel Sliding Windows method which can: process a graph with mutable edge values efficiently from disk, with only a small number of non-sequential disk accesses, while supporting the asynchronous model of computation. The result is graphs with billions of edges can be processed on just a single machine. It uses a vertex-centric computation model similar to Pregel, which supports iterative algorithms as opposed to the batch style of MapReduce. Streaming graph updates are supported. About GraphChi, Carlos Guestrin, codirector of Carnegie Mellon's Select Lab, says: A Mac Mini running GraphChi can analyze Twitter's social graph from 2010—which contains 40 million users and 1.2 billion connections—in 59 minutes. "The previous published result on this problem took 400 minutes using a cluster of about 1,000 computers." Related Articles Aapo Kyrola Home Page Your Laptop Can Now Analyze Big Data by JOHN PAVLUS Example Applications Runn
6 0.81379372 1136 high scalability-2011-11-03-Paper: G2 : A Graph Processing System for Diagnosing Distributed Systems
7 0.81000823 631 high scalability-2009-06-15-Large-scale Graph Computing at Google
8 0.80758339 805 high scalability-2010-04-06-Strategy: Make it Really Fast vs Do the Work Up Front
9 0.79147434 621 high scalability-2009-06-06-Graph server
10 0.75175983 801 high scalability-2010-03-30-Running Large Graph Algorithms - Evaluation of Current State-of-the-Art and Lessons Learned
11 0.74317288 155 high scalability-2007-11-15-Video: Dryad: A general-purpose distributed execution platform
12 0.72692299 827 high scalability-2010-05-14-Hot Scalability Links for May 14, 2010
13 0.7095269 1088 high scalability-2011-07-27-Making Hadoop 1000x Faster for Graph Problems
14 0.68913829 842 high scalability-2010-06-16-Hot Scalability Links for June 16, 2010
15 0.64681649 797 high scalability-2010-03-19-Hot Scalability Links for March 19, 2010
16 0.63767856 973 high scalability-2011-01-14-Stuff The Internet Says On Scalability For January 14, 2011
17 0.61968952 58 high scalability-2007-08-04-Product: Cacti
18 0.5986132 722 high scalability-2009-10-15-Hot Scalability Links for Oct 15 2009
19 0.58087856 1064 high scalability-2011-06-20-35+ Use Cases for Choosing Your Next NoSQL Database
20 0.56406087 1365 high scalability-2012-11-30-Stuff The Internet Says On Scalability For November 30, 2012
topicId topicWeight
[(1, 0.096), (2, 0.207), (61, 0.057), (77, 0.424), (79, 0.082), (94, 0.021)]
simIndex simValue blogId blogTitle
Introduction: Successful software design is all about trade-offs. In the typical (if there is such a thing) distributed system, recognizing the importance of trade-offs within the design of your architecture is integral to the success of your system. Despite this reality, I see time and time again, developers choosing a particular solution based on an ill-placed belief in their solution as a “silver bullet”, or a solution that conquers all, despite the inevitable occurrence of changing requirements. Regardless of the reasons behind this phenomenon, I’d like to outline a few of the methods I use to ensure that I’m making good scalable decisions without losing sight of the trade-offs that accompany them. I’d also like to compile (pun intended) the issues at hand, by formulating a simple theorem that we can use to describe this oft occurring situation.
Introduction: James Hamilton in Counting Servers is Hard has an awesome breakdown of what one million plus servers really means in terms of resource usage. The summary of his calculations is eye-popping: Facilities: 15 to 30 large datacenters. Capital expense: $4.25 Billion. Total power: 300MW. Power Consumption: 2.6TWh annually. The power consumption is about the same as used by Nicaragua and the capital cost is about a third of what Americans spent on video games in 2012. Now that's web scale.
3 0.88450086 1116 high scalability-2011-09-15-Paper: It's Time for Low Latency - Inventing the 1 Microsecond Datacenter
Introduction: In It's Time for Low Latency Stephen Rumble et al. explore the idea that it's time to rearchitect our stack to live in the modern era of low-latency datacenters instead of high-latency WANs. The implications for program architectures will be revolutionary. Luiz André Barroso, Distinguished Engineer at Google, sees ultra low latency as a way to make computer resources as fungible as possible, that is, interchangeable and location independent, effectively turning a datacenter into a single computer. Abstract from the paper: The operating systems community has ignored network latency for too long. In the past, speed-of-light delays in wide area networks and unoptimized network hardware have made sub-100µs round-trip times impossible. However, in the next few years datacenters will be deployed with low-latency Ethernet. Without the burden of propagation delays in the datacenter campus and network delays in the Ethernet devices, it will be up to us to finish
4 0.87450773 258 high scalability-2008-02-24-Yandex Architecture
Introduction: Update: Anatomy of a crash in a new part of Yandex written in Django . Writing to a magic session variable caused an unexpected write into an InnoDB database on every request. Writes took 6-7 seconds because of index rebuilding. Lots of useful details on the sizing of their system, what went wrong, and how they fixed it. Yandex is a Russian search engine with 3.5 billion pages in their search index. We only know a few fun facts about how they do things, nothing at a detailed architecture level. Hopefully we'll learn more later, but I thought it would still be interesting. From Allen Stern's interview with Yandex's CTO Ilya Segalovich, we learn: 3.5 billion pages in the search index. Over several thousand servers. 35 million searches a day. Several data centers around Russia. Two-layer architecture. The database is split in pieces and when a search is requested, it pulls the bits from the different database servers and brings it together for the user. Languages
same-blog 5 0.85454929 766 high scalability-2010-01-26-Product: HyperGraphDB - A Graph Database
6 0.84202611 1195 high scalability-2012-02-17-Stuff The Internet Says On Scalability For February 17, 2012
7 0.82826644 525 high scalability-2009-03-05-Product: Amazon Simple Storage Service
8 0.82280898 959 high scalability-2010-12-17-Stuff the Internet Says on Scalability For December 17th, 2010
9 0.82178992 753 high scalability-2009-12-21-Hot Holiday Scalability Links for 2009
10 0.81697798 211 high scalability-2008-01-13-Google Reveals New MapReduce Stats
11 0.80579793 1531 high scalability-2013-10-13-AIDA: Badoo’s journey into Continuous Integration
12 0.76823324 1059 high scalability-2011-06-14-A TripAdvisor Short
13 0.7662822 439 high scalability-2008-11-10-Scalability Perspectives #1: Nicholas Carr – The Big Switch
14 0.75643867 1377 high scalability-2012-12-26-Ask HS: What will programming and architecture look like in 2020?
15 0.74231601 1158 high scalability-2011-12-16-Stuff The Internet Says On Scalability For December 16, 2011
16 0.7322309 612 high scalability-2009-05-31-Parallel Programming for real-world
17 0.72808743 1188 high scalability-2012-02-06-The Design of 99designs - A Clean Tens of Millions Pageviews Architecture
18 0.72779918 1571 high scalability-2014-01-02-xkcd: How Standards Proliferate:
19 0.70609087 212 high scalability-2008-01-14-OpenSpaces.org community site launched - framework for building scale-out applications
20 0.6966241 1567 high scalability-2013-12-20-Stuff The Internet Says On Scalability For December 20th, 2013