
Google App Engine: Working, Performance, and Efficient Usage

1Digender Mahara, 2Prof. Megha Wankhade; 1MCA Student, Mumbai University, Mumbai, MH, India; 2Associate Professor, MCA Department, Mumbai University, Mumbai, MH, India

This paper describes the working of Google App Engine (GAE), a "Platform as a Service" (PaaS) technology. GAE hosts Web applications on Google's large-scale server infrastructure and provides applications with scalable resources and many additional features. The paper gives an introduction to Google App Engine, its features, and its limitations. It also presents a Monte Carlo simulation test on Google App Engine that compares its performance and strengths against the Karwendel system. Finally, it provides a few points on the efficient usage of Google App Engine and how it can be utilized to its fullest.
Index Terms-- Google App Engine; Cloud Computing; PaaS; Karwendel.
GAE (Google App Engine) hosts Web applications on Google’s large-scale server infrastructure. It has three main components: scalable services, a runtime environment, and a data store. GAE’s front-end service handles HTTP requests and maps them to the appropriate application servers. Application servers start, initialize, and reuse application instances for incoming requests. During traffic peaks, GAE automatically allocates additional resources to start new instances. The number of new instances for an application and the distribution of requests depend on traffic and resource use patterns. So, GAE performs load balancing and cache management automatically. Each application instance executes in a sandbox (a runtime environment abstracted from the underlying operating system). This prevents applications from performing malicious operations and enables GAE to optimize CPU and memory utilization for multiple applications on the same physical machine. Sandboxing also imposes various programmer restrictions:

• Applications have no access to the underlying hardware and only limited access to network facilities.

• Java applications can use only a subset of the standard library functionality.

• Applications can’t use threads.

• A request has a maximum of 30 seconds to respond to the client.

GAE applications use resources such as CPU time, I/O bandwidth, and the number of requests within certain quotas associated with each resource type.

The CPU time is, roughly speaking, equivalent to the number of CPU cycles that a 1.2-GHz Intel x86 processor can perform in the same amount of time. Information on resource usage can be obtained through the GAE application administration Web interface. Finally, the data store lets developers persist data beyond individual requests. The data store can’t be shared across different slave applications.
II. GOOGLE APP ENGINE AS A PARALLEL COMPUTING FRAMEWORK

To support the development of parallel applications with GAE, we designed a Java-based generic framework. Implementing a new application in our framework requires specialization for three abstract interfaces (classes): JobFactory, WorkJob, and Result. The master application is a Java program that implements JobFactory on the user’s local machine. JobFactory manages the algorithm’s logic and parallelization in several WorkJobs. WorkJob is an abstract class implemented as part of each slave application—in particular, the run method, which executes the actual computational job. Each slave application deploys as a separate GAE application and, therefore, has a distinct URI. The slave applications provide a simple HTTP interface and accept either data requests or computational job requests.
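
The paper names only the three abstract types and WorkJob's run method; the following Java sketch shows one way they might fit together. All other method names and signatures (createJobs, aggregate) are illustrative assumptions, not the framework's actual API.

import java.io.Serializable;
import java.util.List;

// Minimal sketch of the framework's three abstract types. Only the names
// JobFactory, WorkJob, Result, and the run method come from the paper;
// everything else is an illustrative assumption.
public final class FrameworkSketch {

    /** Result of one computational job, returned by a slave to the master. */
    public interface Result extends Serializable {
    }

    /** A unit of work executed inside a slave application's sandbox. */
    public abstract static class WorkJob implements Serializable {
        /** Executes the actual computational job on the slave. */
        public abstract Result run();
    }

    /** Implemented by the master: splits the problem and merges partial results. */
    public interface JobFactory {
        /** Decomposes the problem into independent WorkJobs (hypothetical signature). */
        List<WorkJob> createJobs(int parallelism);

        /** Combines the partial results returned by the slaves (hypothetical signature). */
        Result aggregate(List<Result> partialResults);
    }

    private FrameworkSketch() {
    }
}
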
The HTTP message header stores the type of request. A job request contains one WorkJob that’s submitted to a slave application and executed there. If multiple requests are submitted to the same slave application, GAE automatically starts and manages multiple instances to handle the current load; the programmer doesn’t have control over the instances. (One slave application is, in theory, sufficient; however, our framework can distribute jobs among multiple slave applications to solve larger problems.) A data request transfers data shared by all jobs to the persistent data store. It uses multiple parallel HTTP requests to comply with GAE’s maximum HTTP payload size of 1 Mbyte and to improve bandwidth utilization.
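
Because the shared input data must travel in requests no larger than 1 Mbyte, the master has to split it into chunks before issuing the parallel data requests. The helper below is a minimal sketch of that splitting step only; the class name, chunk constant, and method are assumptions, and the actual upload code is omitted.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: split shared input data into chunks that respect GAE's 1-Mbyte
// HTTP payload limit; each chunk would then be sent in its own data request.
public final class DataChunker {

    private static final int MAX_PAYLOAD_BYTES = 1_000_000; // roughly 1 Mbyte per request

    /** Splits a byte array into payload-sized chunks. */
    public static List<byte[]> split(byte[] sharedData) {
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < sharedData.length; offset += MAX_PAYLOAD_BYTES) {
            int end = Math.min(sharedData.length, offset + MAX_PAYLOAD_BYTES);
            chunks.add(Arrays.copyOfRange(sharedData, offset, end));
        }
        return chunks;
    }

    private DataChunker() {
    }
}
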
Mapping WorkJobs to resources follows a dynamic work-pool approach that’s suitable for slaves running as black boxes on sandboxed resources with unpredictable execution times. Each slave application has an associated job manager in the context of the master application. It requests WorkJobs from the global pool, submits them to its slave application, and manages the parallel requests in separate threads.
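
The dynamic work pool can be pictured as one job-manager thread per slave application, each repeatedly taking a job from a shared pool until it is empty. The sketch below illustrates the idea only; submitOverHttp is a placeholder for serializing a WorkJob into an HTTP job request, and the slave URIs are hypothetical.

import java.util.List;
import java.util.Queue;

// Sketch of the dynamic work-pool approach: one job-manager thread per slave
// pulls jobs from the shared pool and submits them one at a time.
public final class WorkPoolSketch {

    public static void dispatch(List<String> slaveUris, Queue<String> jobPool)
            throws InterruptedException {
        Thread[] managers = new Thread[slaveUris.size()];
        for (int i = 0; i < slaveUris.size(); i++) {
            final String slaveUri = slaveUris.get(i);
            managers[i] = new Thread(() -> {
                String job;
                // Keep requesting jobs until the global pool runs dry.
                while ((job = jobPool.poll()) != null) {
                    submitOverHttp(slaveUri, job);
                }
            });
            managers[i].start();
        }
        for (Thread manager : managers) {
            manager.join();    // wait until every job has been processed
        }
    }

    private static void submitOverHttp(String slaveUri, String job) {
        // Placeholder: the real framework serializes a WorkJob into an HTTP
        // request and blocks until the slave returns the Result.
        System.out.println("submitting " + job + " to " + slaveUri);
    }

    private WorkPoolSketch() {
    }
}

A thread-safe pool such as java.util.concurrent.ConcurrentLinkedQueue keeps the poll calls of different job managers from interfering with one another.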

Resource-provisioning overhead is the time between issuing an HTTP request and receiving the HTTP response. Various factors beyond the underlying TCP network influence the overhead (for example, load balancing to assign a request to an application server, which includes the initialization of an instance if none exists). To measure the overhead, we sent HTTP ping requests with payloads between 0 and 2.7 Mbytes in 300-Kbyte steps, repeated 50 times for each size, and took the average. The overhead didn’t increase linearly with the payload (see Figure below) because TCP achieved higher bandwidth for larger payloads. We measured overhead in seconds; IaaS-based infrastructures, such as Amazon Elastic Compute Cloud (EC2), exhibit latencies measured in minutes.
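
The measurement loop can be reproduced roughly as follows. The /ping URL and the use of HttpURLConnection are assumptions made for illustration; the figures reported above came from the authors' own setup.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Sketch of the overhead measurement: POST payloads from 0 to 2.7 Mbytes in
// 300-Kbyte steps, repeat each size 50 times, and average the round-trip time.
public final class OverheadProbe {

    public static void main(String[] args) throws Exception {
        String slaveUri = "https://example.appspot.com/ping";  // hypothetical slave URL
        for (int size = 0; size <= 2_700_000; size += 300_000) {
            byte[] payload = new byte[size];
            long totalNanos = 0;
            for (int i = 0; i < 50; i++) {
                long start = System.nanoTime();
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(slaveUri).openConnection();
                conn.setDoOutput(true);
                conn.setRequestMethod("POST");
                try (OutputStream out = conn.getOutputStream()) {
                    out.write(payload);       // send the payload
                }
                conn.getResponseCode();       // block until the response arrives
                conn.disconnect();
                totalNanos += System.nanoTime() - start;
            }
            System.out.printf("%d bytes: %.3f s average overhead%n",
                    size, totalNanos / 50 / 1e9);
        }
    }

    private OverheadProbe() {
    }
}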

Figure 1. Resource-provisioning overhead didn’t increase linearly with the payload because TCP achieved higher bandwidth for larger payloads.

A GAE environment can have three types of failure: an exceeded quota, offline slave applications, or loss of connectivity. To cope with such failures, the master implements a simple fault-tolerance mechanism that resubmits the failed WorkJobs to the corresponding slaves using an exponential back-off time-out that depends on the failure type.
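
The resubmission logic amounts to a retry loop whose waiting time doubles after each failure. The sketch below shows that pattern in isolation; the base delay, the retry limit, and the Runnable stand-in for submitting a WorkJob are all assumptions.

// Sketch of exponential back-off resubmission: retry a failed submission,
// doubling the waiting time after each failure until a retry limit is hit.
public final class BackoffRetry {

    public static boolean resubmitWithBackoff(Runnable submitJob, int maxAttempts,
                                              long baseDelayMillis) throws InterruptedException {
        long delayMillis = baseDelayMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                submitJob.run();              // stand-in for submitting the WorkJob
                return true;                  // success: stop retrying
            } catch (RuntimeException failure) {
                if (attempt == maxAttempts) {
                    return false;             // give up after the last attempt
                }
                Thread.sleep(delayMillis);    // wait before resubmitting
                delayMillis *= 2;             // exponential back-off
            }
        }
        return false;
    }

    private BackoffRetry() {
    }
}
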
A Java virtual machine’s just-in-time (JIT) compilation converts frequently used parts of byte code to native machine code, notably improving performance. To observe JIT compilation effects, we implemented a simple Fibonacci number generator. We submitted it to GAE 50 times in sequence with a delay of one second, always using the same problem size. We set up the slave application with no instances running and measured the effective computation time in the run of each WorkJob. As we described earlier, GAE spawns instances of an application depending on its recent load (the more requests, the more instances). To mark and track instances, we used a Singleton class that contained a randomly initialized static identifier field.


One way to approximate π is through a simple Monte Carlo simulation that inscribes a circle into a square, generates p uniformly distributed random points in the square, and counts the m points that lie in the circle. We can then approximate π ≈ 4m/p. We ran this algorithm on GAE. Obtaining consistent measurements from GAE is difficult for two reasons. First, the programmer has no control over the slave instances. Second, two identical consecutive requests to the same Web application could execute on completely different hardware in different locations. To minimize the bias, we repeated all experiments 10 times, eliminated outliers, and averaged all runs.
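
For reference, the core of such a Monte Carlo job is only a few lines. The sketch below samples the unit square and an inscribed quarter circle, which yields the same ratio π ≈ 4m/p; the problem size and seed in main are arbitrary.

import java.util.Random;

// Sketch of the Monte Carlo π kernel: draw p uniform points in the unit
// square, count the m points inside the inscribed quarter circle, and
// return 4m/p as the π estimate.
public final class MonteCarloPi {

    public static double approximate(long p, long seed) {
        Random random = new Random(seed);
        long m = 0;
        for (long i = 0; i < p; i++) {
            double x = random.nextDouble();
            double y = random.nextDouble();
            if (x * x + y * y <= 1.0) {   // the point lies inside the circle
                m++;
            }
        }
        return 4.0 * m / p;
    }

    public static void main(String[] args) {
        System.out.println(approximate(10_000_000L, 42L));  // prints a value close to 3.1416
    }

    private MonteCarloPi() {
    }
}
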
A warm-up phase was conducted for Karwendel, owing to Google’s larger hardware infrastructure.

Cost analysis: Although we conducted all our experiments within the free daily quotas that Google offered, it was still important to estimate costs to understand the price of executing our applications in real life. So, alongside the π approximation, we implemented three algorithms with different computation and communication complexity:

• matrix multiplication, based on row-wise distribution of the first matrix and full broadcast of the second;

• Mandelbrot set generation, based on the escape time algorithm (a sketch of the escape-time iteration follows this list); and

• rank sort, based on each array element’s separate rank computation, which could potentially outperform other, faster sequential algorithms when run in parallel.
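
For the Mandelbrot bullet, the escape-time algorithm iterates z = z² + c for each pixel's complex coordinate c and records how many iterations it takes for |z| to exceed 2. The sketch below shows the per-point kernel only; the sample coordinate and iteration cap in main are arbitrary.

// Sketch of the escape-time kernel for one complex point c = cRe + i*cIm.
public final class EscapeTime {

    public static int escapeIterations(double cRe, double cIm, int maxIterations) {
        double zRe = 0.0;
        double zIm = 0.0;
        for (int i = 0; i < maxIterations; i++) {
            if (zRe * zRe + zIm * zIm > 4.0) {   // |z| > 2: the point escapes
                return i;
            }
            double nextRe = zRe * zRe - zIm * zIm + cRe;  // real part of z^2 + c
            zIm = 2.0 * zRe * zIm + cIm;                  // imaginary part of z^2 + c
            zRe = nextRe;
        }
        return maxIterations;                    // never escaped: treated as inside the set
    }

    public static void main(String[] args) {
        System.out.println(escapeIterations(-0.75, 0.1, 1_000));
    }

    private EscapeTime() {
    }
}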

A warm-up phase was run for each application to determine the queue size and eliminate JIT compilation’s effects. The π calculation algorithm was executed first sequentially and then with an increasing number of parallel jobs by generating a corresponding number of WorkJobs in the JobFactory work pool. We chose a problem of 220 million random points, which produced a sequential execution time slightly below the 30-second limit. For each experiment, we measured and analyzed two metrics. The first was the computation time, which represented the average execution time of the run method. The second was the average overhead, which represented the difference between the total execution time and the computation time (especially due to request latencies).
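
Splitting the 220 million points across the requested number of parallel jobs is simple arithmetic; the sketch below shows one even split (class and method names are illustrative, not the framework's API).

import java.util.ArrayList;
import java.util.List;

// Sketch: divide the total number of random points evenly across the
// parallel WorkJobs, spreading any remainder so sizes differ by at most one.
public final class PiJobSplitter {

    public static List<Long> pointsPerJob(long totalPoints, int parallelJobs) {
        List<Long> jobSizes = new ArrayList<>();
        long base = totalPoints / parallelJobs;
        long remainder = totalPoints % parallelJobs;
        for (int i = 0; i < parallelJobs; i++) {
            jobSizes.add(base + (i < remainder ? 1 : 0));
        }
        return jobSizes;
    }

    public static void main(String[] args) {
        System.out.println(pointsPerJob(220_000_000L, 8));  // eight jobs of 27,500,000 points
    }

    private PiJobSplitter() {
    }
}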

The serial execution on GAE was about two times slower than on Karwendel, owing to a slower random-number-generation routine in GAE’s standard math library. On Karwendel, transferring jobs and results incurred almost no overhead, owing to the fast local network between the master and the slaves. So, the average computation time and total execution time were almost identical up to eight parallel jobs (Karwendel has eight cores). Up to that point, almost linear speedup occurred. Using more than eight parallel jobs generated a load imbalance that deteriorated speedup because two jobs had to share one physical core. GAE exhibited a constant data transfer and total overhead of approximately 700 milliseconds in both cases, which explains its lower speedup. Random background load on the GAE servers or on the Internet caused the slight irregularities in execution time for different machine sizes.

This classic scalability analysis method didn’t favor GAE because the 30-second limit let us execute only relatively small problems (in which Amdahl’s law limits scalability). To eliminate this barrier and evaluate GAE’s potential for computing larger problems, we used Gustafson’s law4 to increase the problem size proportionally to the machine size and observed the impact on the execution time (which should stay constant for an ideal speedup). We distributed the jobs to 10 GAE slave applications instead of one to gain sufficient quotas (in minutes). In this case, we started with an initial problem of 180 million random points to avoid exceeding the 30-second limit. (For a larger number of jobs, GAE can’t provide more resources and starts denying connections.) Again, Karwendel had a constant execution time up to eight parallel jobs, demonstrating our framework’s good scalability. Starting with nine parallel jobs, the execution time steadily increased proportionally to the problem size. GAE showed similarly good scalability up to 10 parallel jobs; starting additional parallel jobs slightly increased the execution time. The overhead of aborted requests (owing to quotas being reached) caused most irregularities. For more than 17 parallel jobs, GAE had a lower execution time than Karwendel.

We ran each experiment 100 times in sequence for each problem size and analyzed the cost of the three most limiting resources: CPU time, incoming data, and outgoing data, which we obtained through the Google application administration interface. We used the Google prices as of 10 January 2011: US$0.12 per outgoing Gbyte, $0.10 per incoming Gbyte, and $0.10 per CPU hour. We didn’t analyze the data store quota because the overall CPU hours include its usage. As we expected, π approximation was the most computationally intensive and had almost no data-transfer cost. Surprisingly, rank sort consumed little bandwidth compared to CPU time, even though the full unsorted array had to be transferred to the slaves and the rank of each element had to be transferred back to the master. The Mandelbrot set generator was clearly dominated by the amount of image data that must be transferred to the master. For π approximation, we could generally sample approximately 129 × 10⁹ random points for US$1 because the algorithm has linear computational effort. For the other algorithms, a precise estimation is more difficult because resource consumption doesn’t increase linearly with the problem size.
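
Using the three prices quoted above, a cost estimate reduces to a weighted sum of the consumed resources. The sketch below applies those 2011 prices; the usage figures in main are made up purely for illustration.

// Sketch of the cost model implied by the quoted prices: $0.10 per CPU hour,
// $0.10 per incoming Gbyte, and $0.12 per outgoing Gbyte.
public final class CostEstimate {

    private static final double CPU_HOUR_USD = 0.10;
    private static final double INCOMING_GBYTE_USD = 0.10;
    private static final double OUTGOING_GBYTE_USD = 0.12;

    public static double costUsd(double cpuHours, double incomingGbytes, double outgoingGbytes) {
        return cpuHours * CPU_HOUR_USD
                + incomingGbytes * INCOMING_GBYTE_USD
                + outgoingGbytes * OUTGOING_GBYTE_USD;
    }

    public static void main(String[] args) {
        // Hypothetical compute-heavy run with little data transfer, like the π approximation.
        System.out.printf("estimated cost: $%.2f%n", costUsd(9.5, 0.01, 0.02));
    }

    private CostEstimate() {
    }
}
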
Finally, we estimated the cost of running the same experiments on the Amazon EC2 infrastructure using EC2’s m1.small instances, which have a computational performance of one EC2 compute unit. This is equivalent to a 1.2-GHz Xeon or Opteron processor, which is similar to GAE and enables a direct comparison. We packaged the implemented algorithms into Xen-based virtual machines.

Figure 2. Scalability results for GAE and Karwendel for proportionally increasing machine and problem sizes. Karwendel had a constant execution time up to eight parallel jobs, demonstrating the framework’s good scalability.
Like any other system, Google App Engine also has some limitations. Some of them are:

Initial performance has always been an issue with Google App Engine. Whenever a new instance is started for a new set of requests, the startup takes considerably longer, which can cause performance issues at the beginning.

As the Monte Carlo simulation test above shows, Google App Engine incurs considerable overhead when running parallel jobs.


Analysis of four commercial infrastructure-as-a-service-based clouds for scientific computing showed that cloud performance is lower than that of traditional scientific computing. However, the analysis indicated that cloud computing might be a viable alternative for scientists who need resources instantly and temporarily.

1. Alexandru Iosup and his colleagues examined the long-term performance variability of Google App Engine (GAE) and Amazon Elastic Compute Cloud (EC2). The results showed yearly and daily patterns, as well as periods of stable performance.

2. The researchers concluded that GAE’s and EC2’s performance varied among different large-scale applications. Christian Vecchiola and his colleagues analyzed different cloud providers from the perspective of high-performance computing applications, emphasizing the Aneka platform-as-a-service (PaaS) framework.

3. Aneka requires a third-party deployment cloud platform and doesn’t support GAE. Windows Azure is a PaaS provider comparable to GAE but better suited for scientific problems. Jie Li and colleagues compared its performance to that of a desktop computer but performed no cost analysis.

4. MapReduce frameworks offer a different approach to cloud computation.

5. MapReduce is an orthogonal application class5 that targets large-data processing.6 It’s less suited for computationally intensive parallel algorithms8—for example, those operating on small datasets. Furthermore, it doesn’t support the implementation of more complex applications, such as recursive and nonlinear problems or scientific workflows.


1. Use a cache for storing data rather than writing directly to the App Engine Datastore. Writing this data to the Datastore only after a certain amount of time significantly saves time and cost (a minimal sketch combining this tip and the next follows this list).

2. Use task queues to run long processes in the back end. This also reduces the activation of front-end instances.

3. Use static shared objects rather than creating new objects; this also reduces the pressure on the front-end instances.

4. Keep resident instances that are always up to handle any request. This significantly improves response time. However, there should be a limit on the number of resident instances kept up, since too many might affect performance and will also increase cost.
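
A minimal sketch of tips 1 and 2, assuming the classic App Engine Java SDK (com.google.appengine.api.*): the hot request path updates a Memcache entry instead of the Datastore, and a task queue later performs the deferred write. The key name, the /flush-counter URL, and the non-atomic read-then-put are illustrative simplifications, not a complete implementation.

import com.google.appengine.api.memcache.MemcacheService;
import com.google.appengine.api.memcache.MemcacheServiceFactory;
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// Sketch of tips 1 and 2: buffer writes in Memcache and defer the Datastore
// write to a task-queue handler instead of doing it on the front-end instance.
public final class CacheAndQueueSketch {

    private static final MemcacheService CACHE = MemcacheServiceFactory.getMemcacheService();

    /** Tip 1: update the cached value on the hot path (not atomic; kept simple on purpose). */
    public static void recordHit(String counterKey) {
        Long current = (Long) CACHE.get(counterKey);
        CACHE.put(counterKey, current == null ? 1L : current + 1L);
    }

    /** Tip 2: schedule the eventual Datastore write as a task-queue task. */
    public static void scheduleFlush(String counterKey) {
        Queue queue = QueueFactory.getDefaultQueue();
        queue.add(TaskOptions.Builder.withUrl("/flush-counter")   // hypothetical worker URL
                .param("key", counterKey));
    }

    private CacheAndQueueSketch() {
    }
}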

An experimental approach employs the Google App Engine (GAE) for high-performance parallel computing. A generic master-slave framework enables fast prototyping and integration of parallel algorithms that are transparently scheduled and executed on the Google cloud infrastructure. Compared to Amazon Elastic Compute Cloud (EC2), GAE offers lower resource-provisioning overhead and is cheaper for jobs shorter than one hour. Experiments demonstrated good scalability of a Monte Carlo simulation algorithm. Although this approach produced important speedup, two main obstacles limited its performance: middleware overhead and resource quotas.
1. D. Sanderson, Programming Google App Engine, O’Reilly Media, 2009.

2. A. Iosup et al., “Performance Analysis of Cloud Computing Services for Many-Tasks Scientific Computing,” IEEE Trans. Parallel and Distributed Systems, vol. 22, no. 6, 2011, pp. 931–945.

3. M. Sperk, “Scientific Computing in the Cloud with Google App Engine,” master’s thesis, Faculty of Mathematics, Computer Science, and Physics, Univ. of Innsbruck, 2011; http://

4. J.L. Gustafson, “Reevaluating Amdahl’s Law,” Comm. ACM, vol. 31, no. 5, 1988, pp. 532–533
