PAGE: A Partition Aware Engine for Parallel Graph Computation

 

Abstract:

Graph partition quality affects the overall performance of parallel graph computation systems. The quality of a graph partition is measured by the balance factor and the edge cut ratio. A balanced graph partition with a small edge cut ratio is generally preferred since it reduces the expensive network communication cost. However, according to an empirical study on Giraph, performance on a well-partitioned graph can be up to two times worse than on simple random partitions. This is because these systems are optimized only for simple partition strategies and cannot efficiently handle the increasing workload of local message processing when a high-quality graph partition is used. In this paper, we propose a novel partition aware graph computation engine named PAGE, which is equipped with a new message processor and a dynamic concurrency control model. The new message processor concurrently processes local and remote messages in a unified way. The dynamic model adaptively adjusts the concurrency of the processor based on online statistics. The experimental evaluation demonstrates the superiority of PAGE across graph partitions of varying quality.

Algorithm:

 

  • Upload Algorithm.

Text File upload.

  • File Split Techniques.

Split File.

  • Graph partition, Edge Cut Ratio.

 Graph computation.

 


Key points:

  1. File Uploading.
  2. Encrypt key.
  3. File split.

EXISTING SYSTEM

In existing systems, a well-balanced graph partition can even decrease overall performance. When the PageRank algorithm is run on six different partition schemes of a large web graph dataset, the overall cost of PageRank per iteration increases as the quality of the graph partition improves. Many existing parallel graph systems are unaware of this effect of the underlying partitioned subgraphs and ignore the growing workload of local message processing as the partition quality improves. These systems therefore treat local and remote messages unequally and optimize only the processing of remote messages. As a result, existing graph systems still cannot effectively exploit the benefit of high-quality graph partitions.

PROPOSED SYSTEM

Since the message processing pipeline follows the producer-consumer model, several heuristic rules are proposed that take the producer-consumer constraints into account. To balance the workload, the proposed strategies repartition the graph according to the online workload, so the quality of the underlying graph partition changes with each repartitioning. To address this problem, we propose a partition aware graph computation engine named PAGE that monitors three high-level key running metrics and dynamically adjusts the system configuration. In the adjusting model, we elaborate two heuristic rules to effectively extract the system characteristics and generate proper parameters.
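The producer-consumer constraint can be illustrated with a small sketch. The metric names and the rule below are assumptions made for illustration only, not the metrics or heuristics that PAGE actually uses: the sketch simply picks, for each message type, the smallest number of processing workers whose combined throughput keeps up with the rate at which messages arrive.

```python
import math


def required_workers(incoming_rate, per_worker_rate, max_workers):
    """Smallest worker count whose combined throughput covers the incoming rate.

    Hypothetical rule: consumers must process messages at least as fast as
    producers generate them, bounded by the available workers.
    """
    if per_worker_rate <= 0:
        raise ValueError("per_worker_rate must be positive")
    needed = math.ceil(incoming_rate / per_worker_rate)
    return max(1, min(max_workers, needed))


def adjust_concurrency(stats, max_workers):
    """Return (local_workers, remote_workers) for one window of online statistics.

    The keys of `stats` are hypothetical names chosen for this sketch.
    """
    local = required_workers(
        stats["local_msgs_per_sec"], stats["local_rate_per_worker"], max_workers
    )
    remote = required_workers(
        stats["remote_msgs_per_sec"], stats["remote_rate_per_worker"], max_workers
    )
    return local, remote


# With a high-quality partition most messages stay local;
# with a random partition most messages cross the network.
high_quality = {
    "local_msgs_per_sec": 9000, "local_rate_per_worker": 2500,
    "remote_msgs_per_sec": 1000, "remote_rate_per_worker": 2500,
}
random_partition = {
    "local_msgs_per_sec": 1000, "local_rate_per_worker": 2500,
    "remote_msgs_per_sec": 9000, "remote_rate_per_worker": 2500,
}

print(adjust_concurrency(high_quality, max_workers=8))
print(adjust_concurrency(random_partition, max_workers=8))
```

In this toy setting the worker allocation shifts toward local message processing as the partition quality improves, which is the kind of adaptation a fixed configuration tuned for remote messages cannot provide.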

Advantage

  • Easy to Upload files
  • Provide files.

System architecture

MODULE DESCRIPTION

MODULE

Case Study and Data Collection

  • User
  • Admin Authentication

MODULE DESCRIPTION

Case Study and Data Collection

We consider a case study of a web-based collaboration application for evaluating performance. The application allows users to store, manage, and share documents and drawings related to large construction projects. The service composition required for this application includes: Firewall (x1), Intrusion Detection (x1), Load Balancer (x1), Web Server (x4), Application Server (x3), Database Server (x1), Database Reporting Server (x1), Email Server (x1), and Server Health Monitoring (x1). To meet these requirements, our objective is to find the best Cloud service composition.

USER

A balanced graph partition with a small edge cut ratio is generally preferred since it reduces the expensive network communication cost. However, according to an empirical study on Giraph, performance on a well-partitioned graph can be up to two times worse than on simple random partitions. A well-balanced partition (or high-quality partition) usually has a small edge cut and helps improve system performance, because the small edge cut reduces the expensive communication cost between different subgraphs and the balance property generally guarantees that each subgraph has a similar computation workload.

Upload File

The user can upload a file to the database (DB). The admin must approve the data before it is stored in the DB.

Split File

After the user uploads a file to the DB, the file is split into parts and the parts are stored in the database.
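A minimal sketch of the split step, assuming fixed-size chunks. The source does not specify how files are split, so the chunk size and the function names `split_bytes` and `join_chunks` are hypothetical.

```python
from typing import List


def split_bytes(data: bytes, chunk_size: int) -> List[bytes]:
    """Split file content into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_chunks(chunks: List[bytes]) -> bytes:
    """Reassemble the original content from its chunks, in order."""
    return b"".join(chunks)


# Example: split an uploaded text file's content into 16-byte parts.
content = b"graph partition quality affects performance"
parts = split_bytes(content, 16)
restored = join_chunks(parts)
```

Each part can then be stored as a separate record, as long as the order of the parts is kept so the file can be reassembled.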

Edge Cut Ratio

A balanced graph partition with small edge cut ratio is generally preferred since it reduces the expensive network communication cost.
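The two quality measures can be sketched as follows. The source does not give formulas, so this sketch assumes common definitions: the edge cut ratio is the fraction of edges whose endpoints fall in different partitions, and the balance factor is the largest partition size divided by the average partition size. The function and variable names are illustrative.

```python
from collections import Counter


def edge_cut_ratio(edges, part):
    """Fraction of edges whose two endpoints lie in different partitions."""
    if not edges:
        return 0.0
    cut = sum(1 for u, v in edges if part[u] != part[v])
    return cut / len(edges)


def balance_factor(part, num_parts):
    """Largest partition size divided by the ideal (average) size.

    A value of 1.0 means perfectly balanced (assumed definition).
    """
    sizes = Counter(part.values())
    ideal = len(part) / num_parts
    return max(sizes.values()) / ideal


# Toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

# Vertex -> partition id. Both assignments are balanced,
# but they differ widely in how many edges they cut.
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
poor = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

print(edge_cut_ratio(edges, good), balance_factor(good, 2))
print(edge_cut_ratio(edges, poor), balance_factor(poor, 2))
```

In the first assignment only the bridging edge is cut, so nearly all messages sent along edges stay local; in the second, most edges cross partitions and would require network communication.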

Admin Authentication

Prominent examples include web graphs, social networks, and other interactive networks in bioinformatics. The up-to-date web graph contains billions of nodes and trillions of edges. Graph structure can represent various relationships between objects and can better model complex data scenarios. The method consists of three steps: reliability estimation, weak point recommendation, and weak point strengthening, as defined by the overview. In the rest of this section, we briefly describe each of these steps.

Accept user

The admin can accept new user requests and can also block users.

Allow user file

Users can upload files to the cloud. A file is stored in the cloud only after the admin approves it.