Big data
Dynamic Job Ordering and Slot Configurations for MapReduce Workloads
ABSTRACT
MapReduce is a popular parallel computing paradigm for large-scale data processing in clusters and data centers. A MapReduce workload generally contains a set of jobs, each of which consists of multiple map tasks followed by multiple reduce tasks. Because 1) map tasks can only run in map slots and reduce tasks can only run in reduce slots, and 2) there is a general execution constraint that map tasks are executed before reduce tasks, different job execution orders and map/reduce slot configurations for a MapReduce workload have significantly different performance and system utilization. This paper proposes two classes of algorithms to minimize the makespan and the total completion time for an offline MapReduce workload. Our first class of algorithms focuses on job ordering optimization for a MapReduce workload under a given map/reduce slot configuration. In contrast, our second class of algorithms considers the scenario where we can also optimize the map/reduce slot configuration for a MapReduce workload. We perform simulations as well as experiments on Amazon EC2 and show that our proposed algorithms produce results that are up to 15–80 percent better than the current unoptimized Hadoop, leading to significant reductions in running time in practice.
EXISTING SYSTEM
A MapReduce job consists of a set of map and reduce tasks, where reduce tasks are performed after the map tasks. Hadoop, an open-source implementation of MapReduce, has been deployed in large clusters containing thousands of machines by companies such as Amazon and Facebook. In those cluster and data center environments, MapReduce and Hadoop are used to support batch processing for jobs submitted from multiple users (i.e., MapReduce workloads). Despite the many research efforts devoted to improving the performance of a single MapReduce job, relatively little attention has been paid to the system performance of MapReduce workloads. Therefore, this paper aims to improve the performance of MapReduce workloads.
Disadvantages of Existing System:
1. Previous works all focused on single-stage parallelism, where each job has only a single stage.
2. Slow performance of MapReduce workloads.
PROPOSED SYSTEM
In this paper, we target a subset of production MapReduce workloads that consist of independent jobs (e.g., each job processes a distinct data set, with no dependency between jobs), which call for a different approach. For dependent jobs (i.e., a MapReduce workflow), a job can start only when all jobs it depends on have finished their computation, subject to the input-output data dependency. In contrast, for independent jobs the computation of two jobs can overlap: when the current job completes its map-phase computation and starts its reduce-phase computation, the next job can begin its map-phase computation in a pipelined fashion by taking over the map slots released by the previous job.
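To make the pipelined overlap concrete, the following is a minimal Java sketch of a simplified makespan model (an illustration under our own assumptions, not the paper's exact formulation): the workload is approximated as a two-stage flow shop in which the map slots form one stage and the reduce slots the other, and mapTime/reduceTime stand for each job's aggregate phase durations under the given slot configuration.

public class MakespanSketch {
    // mapTime[i], reduceTime[i]: estimated map/reduce phase durations of job i
    // under a fixed map/reduce slot configuration (hypothetical inputs).
    static double makespan(double[] mapTime, double[] reduceTime) {
        double mapEnd = 0, reduceEnd = 0;
        for (int i = 0; i < mapTime.length; i++) {
            mapEnd += mapTime[i];                 // job i's maps start once job i-1 frees the map slots
            reduceEnd = Math.max(reduceEnd, mapEnd) + reduceTime[i]; // reduces wait for their own maps
        }
        return reduceEnd;                         // finish time of the last reduce phase
    }

    public static void main(String[] args) {
        double[] m = {4, 2, 6};
        double[] r = {3, 5, 1};
        System.out.println(makespan(m, r));       // the result depends on the job order
    }
}

Under this two-stage abstraction, reordering the jobs changes the makespan, which is exactly the degree of freedom the job ordering algorithms exploit.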
Advantages of Proposed System:
1. We propose a bi-criteria heuristic algorithm to optimize makespan and total completion time simultaneously.
2. We propose slot configuration algorithms for makespan and total completion time. We also show that they exhibit a proportional feature, which is very important and can be used to address the time efficiency problem of the proposed enumeration algorithms for a large total number of slots.
MODULES
1. Job Ordering Optimization Module
2. Slot Configuration Optimization Module
Module Description:
Job Ordering Optimization:
In this module we significantly improve the makespan and total completion time of MapReduce workloads by applying the job ordering optimization algorithms MK_JR and MK_TCT_JR.
Slot Configuration Optimization:
The slot configuration can have a significant impact on performance for MapReduce workloads. We propose several enumeration algorithms for map/reduce slot configuration optimization with regard to the makespan and total completion time of a MapReduce workload.
SYSTEM REQUIREMENTS
Hardware Requirements:
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Requirements:
· Operating System : Windows XP
· Coding Language : Java
Traffic-Aware Partition and Aggregation in MapReduce for Big Data Applications
ABSTRACT
The MapReduce programming model simplifies large-scale data processing on commodity clusters by exploiting parallel map tasks and reduce tasks. Although many efforts have been made to improve the performance of MapReduce jobs, they ignore the network traffic generated in the shuffle phase, which plays a critical role in performance enhancement. Traditionally, a hash function is used to partition intermediate data among reduce tasks, which, however, is not traffic-efficient because network topology and the data size associated with each key are not taken into consideration. In this paper, we study how to reduce the network traffic cost of a MapReduce job by designing a novel intermediate data partition scheme. Furthermore, we jointly consider the aggregator placement problem, where each aggregator can reduce merged traffic from multiple map tasks. A decomposition-based distributed algorithm is proposed to deal with the large-scale optimization problem for big data applications, and an online algorithm is also designed to adjust data partition and aggregation in a dynamic manner. Finally, extensive simulation results demonstrate that our proposals can significantly reduce network traffic cost under both offline and online cases.
EXISTING SYSTEM:
Intermediate data are shuffled according to a hash function in Hadoop, which can lead to large network traffic because it ignores network topology and the data size associated with each key. To tackle this problem incurred by the traffic-oblivious partition scheme, we take into account both task locations and the data size associated with each key in this paper. By assigning keys with larger data size to reduce tasks closer to the map tasks, network traffic can be significantly reduced. To further reduce network traffic within a MapReduce job, we consider aggregating data with the same keys before sending them to remote reduce tasks. Although a similar function, called a combiner, has already been adopted by Hadoop, it operates immediately after a map task solely on that task's generated data, failing to exploit the data aggregation opportunities among multiple tasks on different machines.
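As a contrast to hash partitioning, the following is a minimal Java sketch of a greedy traffic-aware assignment (an illustration under our own simplifying assumptions, not the paper's optimization formulation): keys with larger intermediate data are matched to reduce tasks with smaller network distance from the map tasks.

import java.util.*;

public class PartitionSketch {
    // bytesPerKey: estimated intermediate data size per key (hypothetical input);
    // dist[r]: a network-distance score of reduce task r from the map tasks.
    // Hadoop's default is hash(key) mod numReducers, which ignores both signals.
    static Map<String, Integer> trafficAware(Map<String, Long> bytesPerKey, double[] dist) {
        List<String> keys = new ArrayList<>(bytesPerKey.keySet());
        keys.sort((a, b) -> Long.compare(bytesPerKey.get(b), bytesPerKey.get(a))); // heaviest keys first
        Integer[] reducers = new Integer[dist.length];
        for (int r = 0; r < dist.length; r++) reducers[r] = r;
        Arrays.sort(reducers, Comparator.comparingDouble(r -> dist[r]));           // closest reducers first
        Map<String, Integer> assignment = new HashMap<>();
        for (int i = 0; i < keys.size(); i++)
            assignment.put(keys.get(i), reducers[i % reducers.length]);            // heavy keys go to close reducers
        return assignment;
    }
}

This greedy ignores reducer load balancing, which a real scheme must also weigh against traffic cost.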
Disadvantages:
1. Traditionally, a hash function is used to partition intermediate data among reduce tasks. This is not traffic-efficient because network topology and the data size associated with each key are not taken into consideration.
2. Hash partitioning leads to large network traffic because it ignores network topology and the data size associated with each key.
PROPOSED SYSTEM:
In this paper, we jointly consider data partition and aggregation for a MapReduce job, with the objective of minimizing the total network traffic. In particular, we propose a distributed algorithm for big data applications by decomposing the original large-scale problem into several subproblems that can be solved in parallel. Moreover, an online algorithm is designed to deal with data partition and aggregation in a dynamic manner. Finally, extensive simulation results demonstrate that our proposals can significantly reduce network traffic cost in both offline and online cases.
Advantages:
1. Each aggregator can reduce merged traffic from multiple map tasks, and the online algorithm adjusts data partition and aggregation in a dynamic manner.
2. The scheme can significantly reduce network traffic cost in both offline and online cases.
ARCHITECTURE:

Fig. 1. Two MapReduce partition schemes.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
System : Pentium IV 2.4 GHz.
Hard Disk : 40 GB.
Floppy Drive : 1.44 MB.
Monitor : 15" VGA Colour.
Mouse : Logitech.
RAM : 512 MB.
SOFTWARE REQUIREMENTS:
Operating system : Windows XP/7.
Coding Language : JAVA
Frontend : AWT, Swings
Backend : MySQL
Tools : Cygwin
Hybrid Job-Driven Scheduling for Virtual MapReduce Clusters
ABSTRACT
It is cost-efficient for a tenant with a limited budget to establish a virtual MapReduce cluster by renting multiple virtual private servers (VPSs) from a VPS provider. To provide an appropriate scheduling scheme for this type of computing environment, we propose in this paper a hybrid job-driven scheduling scheme (JoSS for short) from a tenant’s perspective. JoSS provides not only job-level scheduling, but also map-task level scheduling and reduce-task level scheduling. JoSS classifies MapReduce jobs based on job scale and job type and designs an appropriate scheduling policy to schedule each class of jobs. The goal is to improve data locality for both map tasks and reduce tasks, avoid job starvation, and improve job execution performance. Two variations of JoSS are further introduced to separately achieve a better map-data locality and a faster task assignment. We conduct extensive experiments to evaluate and compare the two variations with current scheduling algorithms supported by Hadoop. The results show that the two variations outperform the other tested algorithms in terms of map-data locality, reduce-data locality, and network overhead without incurring significant overhead. In addition, the two variations are separately suitable for different MapReduce-workload scenarios and provide the best job performance among all tested algorithms.
EXISTING SYSTEM
Typically, a MapReduce cluster consists of a set of commodity machines/nodes located on several racks and interconnected with each other in a local area network (LAN). In this paper, we call this a conventional MapReduce cluster. Because building and maintaining a conventional MapReduce cluster is costly for a person/organization with a limited budget, an alternative is to establish a virtual MapReduce cluster by either renting a MapReduce framework from a MapReduce service provider or renting multiple virtual private servers (VPSs) from a VPS provider. Each VPS is a virtual machine with its own operating system and disk space. For reasons such as availability issues in a datacenter or resource shortage in a popular datacenter, a tenant might rent VPSs from different datacenters operated by the same VPS provider to establish his/her virtual MapReduce cluster.
Disadvantages of Existing System:
1. Job starvation.
2. Building and maintaining a conventional MapReduce cluster is costly for a person/organization with a limited budget.
PROPOSED SYSTEM
In order to provide an appropriate scheduling scheme for a tenant to achieve high map- and reduce-data locality and improve job performance in his/her virtual MapReduce cluster, in this paper we propose a hybrid job-driven scheduling scheme (JoSS for short) that provides scheduling at three levels: job, map task, and reduce task. JoSS classifies MapReduce jobs into either large or small jobs based on the ratio of each job's input size to the average datacenter scale of the virtual MapReduce cluster, and further classifies small MapReduce jobs into either map-heavy or reduce-heavy based on the ratio between each job's reduce-input size and the job's map-input size. Then JoSS uses a particular scheduling policy to schedule each class of jobs such that the corresponding network traffic generated during job execution (especially inter-datacenter traffic) can be reduced, and the corresponding job performance can be improved. In addition, we propose two variations of JoSS, named JoSS-T and JoSS-J, to guarantee a fast task assignment and to further increase the VPS-locality, respectively.
Advantages of Proposed System:
1. JoSS avoids job starvation and improves job performance.
2. A formal proof is also provided to determine the best threshold for classifying MapReduce jobs.
MODULES
1. Job Classification Module
2. Scheduling Policies Module
Module Description:
Job Classification:
In this module, based on the ratio between a job's reduce-input size and its map-input size, each job is classified into either a reduce-heavy job or a map-heavy job.
Scheduling Policies:
In this module, based on the job classification, JoSS uses three scheduling policies:
i. Policy A:
This policy is designed for a small Reduce-Heavy (RH) job
ii. Policy B:
This policy is designed for a small Map-Heavy (MH) job
iii. Policy C:
This policy is designed for a large job
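A minimal Java sketch of how this classification could drive policy selection (the parameters avgDatacenterScale and ratioThreshold are hypothetical placeholders; the paper derives the actual classification threshold formally):

public class JossClassifierSketch {
    enum Policy { A_SMALL_REDUCE_HEAVY, B_SMALL_MAP_HEAVY, C_LARGE }

    // avgDatacenterScale and ratioThreshold are assumed parameters standing in
    // for the cluster's average datacenter scale and the derived threshold.
    static Policy classify(long mapInputBytes, long reduceInputBytes,
                           long avgDatacenterScale, double ratioThreshold) {
        if (mapInputBytes > avgDatacenterScale) return Policy.C_LARGE;       // Policy C
        double ratio = (double) reduceInputBytes / mapInputBytes;
        return ratio > ratioThreshold ? Policy.A_SMALL_REDUCE_HEAVY          // Policy A
                                      : Policy.B_SMALL_MAP_HEAVY;            // Policy B
    }
}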
SYSTEM REQUIREMENTS
Hardware Requirements:
- Processor - Pentium IV
- Speed - 1.1 GHz
- RAM - 256 MB
- Hard Disk - 20 GB
- Key Board - Standard Windows Keyboard
- Mouse - Two or Three Button Mouse
- Monitor - SVGA
Software Requirements:
- Operating System : Windows XP
- Coding Language : Java
Cloud computing
A Secure Anti-Collusion Data Sharing Scheme for Dynamic Groups in the Cloud
ABSTRACT
Benefiting from cloud computing, users can achieve an effective and economical approach for data sharing among group members in the cloud, with the characteristics of low maintenance and little management cost. Meanwhile, we must provide security guarantees for the shared data files since they are outsourced. Unfortunately, because of frequent changes of membership, sharing data while preserving privacy is still a challenging issue, especially for an untrusted cloud subject to collusion attack. Moreover, in existing schemes the security of key distribution is based on a secure communication channel; however, having such a channel is a strong assumption and is difficult to achieve in practice. In this paper, we propose a secure data sharing scheme for dynamic members. First, we propose a secure way for key distribution without any secure communication channels, and the users can securely obtain their private keys from the group manager. Second, our scheme can achieve fine-grained access control: any user in the group can use resources in the cloud, and revoked users cannot access the cloud again after they are revoked. Third, we can protect the scheme from collusion attack, which means that revoked users cannot get the original data file even if they conspire with the untrusted cloud. In our approach, by leveraging a polynomial function, we achieve a secure user revocation scheme. Finally, our scheme achieves fine efficiency, which means existing users need not update their private keys when either a new user joins the group or a user is revoked from the group.
EXISTING SYSTEM
Kallahalla et al. presented a cryptographic storage system that enables secure data sharing on untrustworthy servers, based on the technique of dividing files into file groups and encrypting each file group with a file-block key. Yu et al. exploited and combined techniques of key-policy attribute-based encryption, proxy re-encryption, and lazy re-encryption to achieve fine-grained data access control without disclosing data contents.
Disadvantages of Existing System:
1. The file-block keys need to be updated and distributed for a user revocation; therefore, the system had a heavy key distribution overhead.
2. The complexities of user participation and revocation in these schemes are linearly increasing with the number of data owners and the revoked users.
3. The single-owner manner may hinder the implementation of applications, where any member in the group can use the cloud service to store and share data files with others.
PROPOSED SYSTEM
In this paper, we propose a secure data sharing scheme that achieves secure key distribution and data sharing for dynamic groups. We provide a secure way for key distribution without any secure communication channels: the users can securely obtain their private keys from the group manager without any Certificate Authorities, due to the verification of the public key of the user. Our scheme achieves fine-grained access control: with the help of the group user list, any user in the group can use resources in the cloud, and revoked users cannot access the cloud again after they are revoked. The scheme is protected from collusion attack: revoked users cannot obtain the original data files once they are revoked, even if they conspire with the untrusted cloud. We achieve secure user revocation with the help of a polynomial function. Our scheme supports dynamic groups efficiently: when a new user joins the group or a user is revoked from the group, the private keys of the other users do not need to be recomputed and updated. We provide security analysis to prove the security of our scheme.
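As a toy illustration of the polynomial idea (our own simplification, not the scheme's actual cryptographic construction), the group manager can build a polynomial whose roots are the revoked identities; masking key material with the polynomial evaluated at a user's identity yields zero for exactly the revoked users, so they learn nothing even with the cloud's help.

import java.math.BigInteger;
import java.util.List;

public class RevocationSketch {
    // f(x) = product of (x - r) mod p over all revoked identities r,
    // so f(id) == 0 if and only if id has been revoked.
    static BigInteger evalRevocationPoly(List<BigInteger> revokedIds,
                                         BigInteger id, BigInteger p) {
        BigInteger acc = BigInteger.ONE;
        for (BigInteger r : revokedIds)
            acc = acc.multiply(id.subtract(r)).mod(p);
        return acc;
    }

    // Toy masking: a revoked user's mask is 0, which destroys the key material;
    // a non-revoked user can invert f(id) mod p to recover the key.
    static BigInteger maskKey(BigInteger key, BigInteger fOfId, BigInteger p) {
        return key.multiply(fOfId).mod(p);
    }
}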
Advantages of Proposed System:
1. The computation cost is irrelevant to the number of revoked users in the RBAC scheme. The reason is that no matter how many users are revoked, the operations for members to decrypt the data files remain almost the same.
2. The cost is irrelevant to the number of revoked users, because the computation cost of the cloud for file upload in our scheme consists of two signature verifications, which is independent of the number of revoked users. The computation cost of the cloud in the file upload phase of the RBAC scheme is small because verifications between communicating entities are not considered in that scheme.
3. In our scheme, users can securely obtain their private keys from the group manager without Certificate Authorities or secure communication channels. Also, our scheme supports dynamic groups efficiently: when a new user joins the group or a user is revoked from the group, the private keys of the other users do not need to be recomputed and updated.
SYSTEM ARCHITECTURE

MODULES
In this implementation we have 3 main modules,
1. Cloud Module
2. Group Manager Module
3. Group Member Module
Module Description:
Cloud:
The cloud, maintained by the cloud service providers, provides storage space for hosting data files in a pay-as-you-go manner. However, the cloud is untrusted, since the cloud service providers can easily become untrusted; the cloud will therefore try to learn the content of the stored data.
Group Manager:
The group manager takes charge of system parameter generation, user registration, and user revocation. In practical applications, the group manager is usually the leader of the group. Therefore, we assume that the group manager is fully trusted by the other parties.
Group Member:
Group members (users) are a set of registered users who store their own data in the cloud and share it with others. In the scheme, the group membership changes dynamically, due to new user registration and user revocation.
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB (min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
Towards Optimized Fine-Grained Pricing of IaaS Cloud Platform
ABSTRACT
This system aims to investigate an optimized fine-grained and fair pricing scheme. Two tough issues are addressed: (1) the profits of resource providers and customers often contradict each other; (2) VM-maintenance overhead such as startup cost is often too large to be neglected. Not only can we derive an optimal price in the acceptable price range that satisfies both customers and providers simultaneously, but we can also find a best-fit billing cycle to maximize social welfare (i.e., the sum of the cost reductions for all customers and the revenue gained by the provider). We carefully evaluate the proposed optimized fine-grained pricing scheme with two large-scale real-world production traces (one from the Grid Workload Archive and the other from a Google data center). We compare the new scheme to the classic coarse-grained hourly pricing scheme in experiments and find that customers and providers can both benefit from our new approach.
Existing System
Nowadays, once customers terminate their instances, some IaaS providers take it for granted that they can reuse the resources immediately, even though customers are still charged for the whole time period. This is potentially illegitimate, because a seller cannot sell a single item to two customers, which is a violation in economic terms; it is also unfair to the customers. A few other IaaS providers are trying to solve this partial usage waste issue by offering optional fine-grained pricing schemes.
Disadvantages of Existing System:
1. In pay-as-you-go cloud pricing, short-job users have to pay for more than what they actually use.
2. More partial usage waste of rented resources.
3. Cloud providers cannot provide services to more users.
Proposed System
Compared to the existing system, the proposed system has many advantages from adopting our optimized fine-grained pricing scheme: (1) our fine-grained pricing scheme is flexible enough to suit various types of services and sharp demands raised by users, unlike the coarse-grained hourly pricing scheme; (2) our fine-grained pricing scheme can effectively reduce partial usage waste, because idle instance time can be allocated to more customers, especially in a competitive situation; (3) users will feel more satisfied due to the more precise computation of the payment cost, so that more users will join the cloud and resource providers will also benefit in turn.
Advantages of proposed system:
1. The proposed scheme reduces partial usage waste.
2. Short-job users are satisfied with the payment cost.
3. The optimal price point can satisfy both users and providers with maximized total utility.
MODULES
1. Resource Bundle Module
2. Time Granularity Module
3. Unit Price Module
Module Description:
Resource Bundle:
Resource bundle is not an instance but just some type of resource like CPU or RAM. The resource bundle module serves as a kind of container to execute task workloads based on user demands.
Time Granularity:
The time granularity is defined as the minimum length of time in pricing the rented resources, i.e., the smallest billable unit of resource usage time.
Unit Price:
The unit price module specifies how much the user needs to pay per time granularity for the resource consumption.
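Taken together, the three modules determine a customer's bill. A minimal Java sketch with hypothetical numbers shows how unit price and time granularity combine, and why a coarse billing cycle creates partial usage waste:

public class PricingSketch {
    // Cost of using a resource bundle for 'seconds' under a billing cycle of
    // 'granularitySeconds' at 'unitPrice' per cycle: partial cycles round up.
    static double cost(double seconds, double granularitySeconds, double unitPrice) {
        long cycles = (long) Math.ceil(seconds / granularitySeconds);
        return cycles * unitPrice;
    }

    public static void main(String[] args) {
        // hypothetical prices: a 10-minute job under hourly vs. per-minute billing
        System.out.println(cost(600, 3600, 0.10));    // hourly cycle: pays a full hour (0.10)
        System.out.println(cost(600, 60, 0.10 / 60)); // per-minute cycle: pays ~10 minutes (~0.0167)
    }
}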
SYSTEM REQUIREMENTS
Hardware Requirements:
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Requirements:
· Operating System : Windows XP
· Coding Language : Java
· Front end : Swings & AWT
Two-Factor Data Security Protection Mechanism for Cloud Storage System
ABSTRACT
In this paper, we propose a two-factor data security protection mechanism with factor revocability for cloud storage system. Our system allows a sender to send an encrypted message to a receiver through a cloud storage server. The sender only needs to know the identity of the receiver but no other information (such as its public key or its certificate). The receiver needs to possess two things in order to decrypt the ciphertext. The first thing is his/her secret key stored in the computer. The second thing is a unique personal security device which connects to the computer. It is impossible to decrypt the ciphertext without either piece. More importantly, once the security device is stolen or lost, this device is revoked. It cannot be used to decrypt any ciphertext. This can be done by the cloud server which will immediately execute some algorithms to change the existing ciphertext to be un-decryptable by this device. This process is completely transparent to the sender. Furthermore, the cloud server cannot decrypt any ciphertext at any time. The security and efficiency analysis show that our system is not only secure but also practical.
EXISTING SYSTEM
There exists a cryptographic primitive called “leakage-resilient encryption”. The security of such a scheme is still guaranteed if the leakage of the secret key is up to a certain number of bits, such that knowledge of these bits does not help to recover the whole secret key. However, though using a leakage-resilient primitive can safeguard against the leakage of certain bits, there exists another practical limitation. Suppose we put part of the secret key into the security device, and unfortunately the device is stolen. The user needs to obtain a replacement device so that he can continue to decrypt his ciphertext. The trivial way is for the private key generator (PKG) to copy the same bits (as in the stolen device) to the new device. This approach can be easily achieved. Nevertheless, there exists a security risk: if the adversary (who has stolen the security device) can also break into the computer where the other part of the secret key is stored, then it can decrypt all ciphertext corresponding to the victim user. The most secure way is to cease the validity of the stolen security device.
Disadvantages of Existing System:
1. If the user loses his security device, then his/her corresponding ciphertext in the cloud can never be decrypted again. That is, the approach cannot support security device update/revocability.
2. The sender needs to know the serial number/public key of the security device, in addition to the user's identity/public key. That makes the encryption process more complicated.
PROPOSED SYSTEM
In this paper, we propose a novel two-factor security protection mechanism for data stored in the cloud. Our mechanism provides the following nice features: 1) Our system is an IBE (identity-based encryption) based mechanism. That is, the sender only needs to know the identity of the receiver in order to send encrypted data (ciphertext) to him/her; no other information about the receiver (e.g., public key, certificate, etc.) is required. The sender then sends the ciphertext to the cloud, where the receiver can download it at any time. 2) Our system provides two-factor data encryption protection. In order to decrypt the data stored in the cloud, the user needs to possess two things. First, the user needs his/her secret key, which is stored in the computer. Second, the user needs a unique personal security device that connects to the computer (e.g., USB, Bluetooth, or NFC). It is impossible to decrypt the ciphertext without either piece. 3) More importantly, our system, for the first time, provides security device (one of the factors) revocability. Once the security device is stolen or reported as lost, the device is revoked. That is, the device can no longer decrypt any ciphertext (corresponding to the user) under any circumstances. The cloud will immediately execute some algorithms to change the existing ciphertext to be un-decryptable by this device, while the user uses his new/replacement device (together with his secret key) to decrypt his/her ciphertext; this process is completely transparent to the sender.
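As a toy stand-in for the IBE-based construction (our own simplification; it does not capture the revocability mechanism, which relies on the cloud re-encrypting ciphertext), the two-factor requirement can be pictured as splitting the decryption key into two shares, one stored on the computer and one on the device, so that neither share alone suffices:

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

public class TwoFactorSketch {
    // XOR-combine the computer-stored share and the device-stored share;
    // each share alone is statistically independent of the actual key.
    // Shares must have a valid AES key length (e.g., 16 bytes).
    static byte[] combineShares(byte[] computerShare, byte[] deviceShare) {
        byte[] key = new byte[computerShare.length];
        for (int i = 0; i < key.length; i++)
            key[i] = (byte) (computerShare[i] ^ deviceShare[i]);
        return key;
    }

    static byte[] decrypt(byte[] ciphertext, byte[] computerShare,
                          byte[] deviceShare) throws Exception {
        SecretKeySpec key = new SecretKeySpec(combineShares(computerShare, deviceShare), "AES");
        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.DECRYPT_MODE, key);
        return cipher.doFinal(ciphertext);         // fails without both correct shares
    }
}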
Advantages of Proposed System:
1. Our solution not only enhances the confidentiality of the data, but also offers revocability of the device, so that once the device is revoked, the corresponding ciphertext is updated automatically by the cloud server without any action by the data owner.
2. The cloud server cannot decrypt any ciphertext at any time.
SYSTEM ARCHITECTURE

Fig. 1(a): Ordinary data sharing

Fig. 1(b): Update ciphertext after issuing a new security device
Fig. 1: Framework
MODULES
In this implementation we have 5 Modules,
1. Private Key Generator
2. Security Device Issuer
3. Sender Module
4. Receiver Module
5. Cloud Server Module
Module Description:
Private Key Generator:
It is a trusted party responsible for issuing the private key of every user.
Security Device Issuer (SDI):
It is a trusted party responsible for issuing the security device of every user.
Sender:
She is the sender (and the creator) of the ciphertext. She only knows the identity (e.g., email address) of the receiver, but nothing else related to the receiver. After she has created the ciphertext, she sends it to the cloud server for the receiver to download.
Receiver:
He is the receiver of the ciphertext and has a unique identity (e.g., email address). The ciphertext is stored on cloud storage, from which he can download it for decryption. He has a private key (stored in his computer) issued by the PKG and a security device (containing some secret information related to his identity) issued by the SDI. The decryption of ciphertext requires both the private key and the security device.
Cloud server:
The cloud server is responsible for storing all ciphertext (for the receiver to download). Once a user has reported the loss of his security device (and has obtained a new one), the cloud acts as a proxy to re-encrypt all his past and future ciphertext corresponding to the new device. That is, the old device is revoked.
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB (min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
Data Mining
A Simple Message-Optimal Algorithm for Random Sampling from a Distributed Stream
Abstract
We present a simple, message-optimal algorithm for maintaining a random sample from a large data stream whose input elements are distributed across multiple sites that communicate via a central coordinator. At any point in time, the set of elements held by the coordinator represent a uniform random sample from the set of all the elements observed so far. When compared with prior work, our algorithms asymptotically improve the total number of messages sent in the system. We present a matching lower bound, showing that our protocol sends the optimal number of messages up to a constant factor with large probability. We also consider the important case when the distribution of elements across different sites is non-uniform, and show that for such inputs, our algorithm significantly outperforms prior solutions.
Existing System
A fundamental problem in this setting is to obtain a random sample drawn from the union of all distributed streams. This generalizes the classic reservoir sampling problem to the setting of multiple distributed streams, and has applications to approximate query answering, selectivity estimation, and query planning. For example, in the case of network routers, maintaining a random sample from the union of the streams is valuable for network monitoring tasks involving the detection of global properties. Other problems on distributed stream processing, including the estimation of the number of distinct elements and heavy hitters, use random sampling as a primitive (we note, though, that better solutions for the heavy hitters problem in terms of the accuracy parameter may be possible than those provided by random sampling). Distributed random sampling is already used in current-day “big data” systems such as BlinkDB, which use stored random samples to process queries quickly in exchange for relaxed accuracy guarantees. These systems operate on tens of terabytes of data, spread over hundreds of machines, and have shown dramatic speedups for common aggregate queries using sampling.
Disadvantages:
1. The distributed random sampling problem considered is as follows: there are k distributed sites, numbered 1 through k, in addition to a coordinator.
2. The classic reservoir sampling problem (where the algorithm is attributed to Waterman) must be generalized to the setting of multiple distributed streams, with applications to approximate query answering, selectivity estimation, and query planning.
Proposed System
We present a simple message-optimal algorithm for maintaining a uniform random sample, with or without replacement, from a distributed stream. Our main contributions are a simple algorithm for sampling without replacement from a distributed stream, as well as a matching lower bound showing that the message complexity of our algorithm is optimal. The message complexity is the number of message transmissions between the sites and the coordinator.
Advantages:
1. The algorithm is easy to implement, and as our experiments show, has very good practical performance.
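For reference, the single-stream reservoir sampling primitive that the distributed protocol generalizes can be sketched in a few lines of Java; after n elements, each element is in the size-s sample with probability s/n. The distributed setting replaces this local update with message-efficient coordination between the k sites and the coordinator.

import java.util.Random;

public class ReservoirSketch {
    static int[] sample(int[] stream, int s, Random rnd) {
        int[] reservoir = new int[s];
        for (int n = 0; n < stream.length; n++) {
            if (n < s) {
                reservoir[n] = stream[n];            // fill the reservoir first
            } else {
                int j = rnd.nextInt(n + 1);          // uniform index in [0, n]
                if (j < s) reservoir[j] = stream[n]; // keep with probability s/(n+1)
            }
        }
        return reservoir;
    }
}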
Configuration:
H/W System Configuration:
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:
Operating System : Windows 95/98/2000/XP
Programming Language : Java
Clustering Data Streams Based on Shared Density Between Micro-Clusters
Abstract
As more and more applications produce streaming data, clustering data streams has become an important technique for data and knowledge engineering. A typical approach is to summarize the data stream in real time with an online process into a large number of so-called micro-clusters. Micro-clusters represent local density estimates by aggregating the information of many data points in a defined area. On demand, a (modified) conventional clustering algorithm is used in a second offline step to re-cluster the micro-clusters into larger final clusters. For re-clustering, the centers of the micro-clusters are used as pseudo points, with the density estimates used as their weights. However, information about density in the area between micro-clusters is not preserved in the online process, and re-clustering is based on possibly inaccurate assumptions about the distribution of data within and between micro-clusters (e.g., uniform or Gaussian). This paper describes DBSTREAM, the first micro-cluster-based online clustering component that explicitly captures the density between micro-clusters via a shared density graph. The density information in this graph is then exploited for re-clustering based on the actual density between adjacent micro-clusters. We discuss the space and time complexity of maintaining the shared density graph. Experiments on a wide range of synthetic and real data sets highlight that using shared density improves clustering quality over other popular data stream clustering methods, which require the creation of a larger number of smaller micro-clusters to achieve comparable results.
Existing System
Data stream clustering is typically done as a two-stage process, with an online part that summarizes the data into many micro-clusters or grid cells and an offline process in which these micro-clusters (cells) are re-clustered/merged into a smaller number of final clusters. Since re-clustering is an offline process and thus not time critical, it is typically not discussed in detail in papers about new data stream clustering algorithms. Most papers suggest using a (sometimes slightly modified) existing conventional clustering algorithm (e.g., weighted k-means in CluStream) where the micro-clusters are used as pseudo points. Another approach, used in DenStream, is based on reachability, where all micro-clusters that are less than a given distance from each other are linked together to form clusters. Grid-based algorithms typically merge adjacent dense grid cells to form larger clusters (see, e.g., the original versions of D-Stream and MR-Stream).
Disadvantages:
1. The number of clusters varies over time for some of the datasets. This needs to be considered when comparing to CluStream, which uses a fixed number of clusters.
Proposed System
We develop and evaluate a new method to address this problem for micro-cluster-based algorithms. We introduce the concept of a shared density graph, which explicitly captures the density of the original data between micro-clusters during clustering, and then show how the graph can be used for re-clustering micro-clusters. This is a novel approach: instead of relying on assumptions about the distribution of data points assigned to a micro-cluster (MC) (often a Gaussian distribution around a center), it estimates the density in the shared region between micro-clusters directly from the data. To the best of our knowledge, this paper is the first to propose and investigate a shared-density-based re-clustering approach for data stream clustering.
Advantages:
1. This is an important advantage, since it implies that we can tune the online component to produce fewer micro-clusters for shared-density re-clustering.
2. It improves performance and, in many cases, the saved memory more than offsets the memory requirement for the shared density graph.
System Architecture

Fig. 2. MC1 is a single MC. MC2 and MC3 are close to each other, but the density between them is low relative to the two MCs' densities, while MC3 and MC4 are connected by a high-density area.
Modules
1. Leader-Based Clustering
2. Capturing Shared Density
3. Micro-Cluster Connectivity
4. Noise Clusters
Module Description
1. Leader-Based Clustering
DBSTREAM represents each MC by a leader (a data point defining the MC's center) and the density in an area of a user-specified radius r (threshold) around the center. This is similar to DBSCAN's concept of counting the points in an eps-neighborhood.
2. Capturing Shared Density
The fact that MCs in dense areas have overlapping assignment areas can be used to measure the density between MCs by counting the points that are assigned to two or more MCs; see the sketch after this module list.
3. Micro-Cluster Connectivity
Less dense clusters will also have a lower shared density. To detect clusters of different density correctly, we need to define connectivity relative to the densities (weights) of the participating clusters.
4. Noise Clusters
To remove noisy MCs from the final clustering, we have to detect these MCs. Noisy clusters are typically characterized as having low density represented by a small weight.
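A minimal Java sketch of the shared-density bookkeeping referenced in the Capturing Shared Density module (hypothetical data structures; DBSTREAM's actual update also handles fading/decay and cleanup): when a new point falls within radius r of two or more MC leaders, the shared count of each such pair is incremented.

import java.util.*;

public class SharedDensitySketch {
    // leaders.get(i) = center of micro-cluster i; shared.get(i).get(j) counts
    // points that fell into the assignment areas of both MC i and MC j (i < j).
    static void update(double[] point, List<double[]> leaders, double r,
                       Map<Integer, Map<Integer, Integer>> shared) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < leaders.size(); i++)
            if (distance(point, leaders.get(i)) <= r) hits.add(i);
        for (int a = 0; a < hits.size(); a++)         // every pair of MCs that
            for (int b = a + 1; b < hits.size(); b++) // share this point
                shared.computeIfAbsent(hits.get(a), k -> new HashMap<>())
                      .merge(hits.get(b), 1, Integer::sum);
    }

    static double distance(double[] x, double[] y) {
        double sum = 0;
        for (int i = 0; i < x.length; i++) sum += (x[i] - y[i]) * (x[i] - y[i]);
        return Math.sqrt(sum);
    }
}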
Configuration:
H/W System Configuration:
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:
Operating System : Windows 95/98/2000/XP
Programming Language : Java
Booster in High Dimensional Data Classification
Abstract
Classification problems in high-dimensional data with a small number of observations are becoming more common, especially in microarray data. During the last two decades, many efficient classification models and feature selection (FS) algorithms have been proposed to achieve higher prediction accuracy. However, the result of an FS algorithm based on prediction accuracy will be unstable over variations in the training set, especially in high-dimensional data. This paper proposes a new evaluation measure, the Q-statistic, that incorporates the stability of the selected feature subset in addition to the prediction accuracy. Then, we propose the Booster of an FS algorithm, which boosts the value of the Q-statistic of the algorithm applied. Empirical studies based on synthetic data and 14 microarray data sets show that Booster boosts not only the value of the Q-statistic but also the prediction accuracy of the algorithm applied, unless the data set is intrinsically difficult to predict with the given algorithm.
Existing System
Methods used in statistical variable selection problems, such as forward selection, backward elimination, and their combination, can be used for FS problems. Most of the successful FS algorithms in high-dimensional problems have utilized the forward selection method but not the backward elimination method, since it is impractical to implement backward elimination with a huge number of features. A serious intrinsic problem with forward selection is, however, that a flip in the decision on the initial feature may lead to a completely different feature subset, and hence the stability of the selected feature set will be very low, although the selection may yield very high accuracy. This is known as the stability problem in FS. Research in this area is relatively new, and devising an efficient method to obtain a more stable feature subset with high accuracy is a challenging area of research.
Disadvantages:
1. Several studies based on re-sampling techniques have been done to generate different data sets for the classification problem, and some of these studies utilize re-sampling on the feature space.
Proposed System
This paper proposes the Q-statistic to evaluate the performance of an FS algorithm with a classifier. This is a hybrid measure of the prediction accuracy of the classifier and the stability of the selected features. The paper then proposes Booster for the selection of a feature subset from a given FS algorithm. The basic idea of Booster is to obtain several data sets from the original data set by re-sampling on the sample space. The FS algorithm is then applied to each of these re-sampled data sets to obtain different feature subsets, and the union of these selected subsets is the feature subset obtained by the Booster of the FS algorithm. Empirical studies show that the Booster of an algorithm boosts not only the value of the Q-statistic but also the prediction accuracy of the classifier applied.
Advantages:
1. The prediction accuracy of classification without consideration of the stability of the selected feature subset.
2. The MI estimation with numerical data involves density estimation of high-dimensional data.
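A minimal Java sketch of the Booster loop (a bootstrap is used here as one possible way to re-sample the sample space; the paper's exact re-sampling may differ): run the given FS algorithm on b resamples and return the union of the selected feature subsets.

import java.util.*;
import java.util.function.Function;

public class BoosterSketch {
    // 'fs' stands in for any FS algorithm (e.g., mRMR, FCBF) that returns the
    // indices of the selected features for a given data subset.
    static Set<Integer> booster(int[][] data, int b,
                                Function<int[][], Set<Integer>> fs, Random rnd) {
        Set<Integer> union = new HashSet<>();
        for (int t = 0; t < b; t++) {
            int[][] resample = new int[data.length][];
            for (int i = 0; i < data.length; i++)
                resample[i] = data[rnd.nextInt(data.length)]; // resample rows
            union.addAll(fs.apply(resample));                 // features chosen on this resample
        }
        return union;                                         // union of all selected subsets
    }
}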
Modules
1. Efficiency of Booster
2. Booster Boosts Accuracy
3. Booster Boosts Q-Statistic
Module Description
1. Efficiency of Booster
This presents the effect of s-Booster on accuracy and Q-statistic against the original s. The classifier used here is NB.
2. Booster Boosts Accuracy
mRMR-Booster is more efficient in boosting the accuracy of the original mRMR when it gives low accuracies.
3. Booster Boosts Q-Statistic
FCBF gives poor performance on Q-statistic, in contrast to its high performance on accuracy. Booster improves the Q-statistic in all the cases considered except the case with one data set.
Configuration:
H/W System Configuration:
Processor - Pentium III
Speed - 1.1 GHz
RAM - 256 MB (min)
Hard Disk - 20 GB
Floppy Drive - 1.44 MB
Key Board - Standard Windows Keyboard
Mouse - Two or Three Button Mouse
Monitor - SVGA
S/W System Configuration:
Operating System : Windows 95/98/2000/XP
Programming Language : Java
Mobile Computing
Privacy-Aware High-Quality Map Generation with Participatory Sensing
ABSTRACT
Accurate maps are increasingly important with the growth of smartphones and the development of location-based services. Several crowdsourcing-based map generation protocols that rely on users to provide their traces have been proposed. Effective as they are, however, those methods pose a significant threat to user privacy, since the traces can easily reveal user behavior patterns. On the flip side, crowdsourcing-based map generation does need individual locations. To address the issue, we present a systematic participatory-sensing-based high-quality map generation scheme, PMG, that meets the privacy demand of individual users. To be specific, individual users merely need to upload unorganized sparse location points, reducing the risk of exposing their traces; we utilize the Crust, a technique from computational geometry for curve reconstruction, to estimate the unobserved map as well as to evaluate the degree of privacy leakage. Experiments show that our solution is able to generate high-quality maps for a real environment and is robust to noisy data. The difference between the ground-truth map and the produced map is less than 10 m, even when the collected locations are about 32 m apart after clustering for the purpose of removing noise.
EXISTING SYSTEM
Currently, digital maps based on satellite images and street-level information are widely used. But they cannot precisely reflect the most up-to-date ground information, especially in developing countries: when cities are under construction and renovation, the integrated maps are likely to be far behind the current state. To reflect map dynamics accurately and effectively, several techniques have been proposed recently, among which participatory sensing attracts the most attention. Individual users contribute their trace information (with GPS data) to a central map generation server. While guaranteeing high quality of map information, the existing methods have various limitations.
Disadvantages of Existing System:
1. Energy inefficiency
2. Privacy leakage
PROPOSED SYSTEM
In this study, we design a privacy-aware map generation scheme, PMG. Unlike existing methods, in our scheme each user selectively chooses, reshuffles, and uploads a few locations from their traces, instead of the entire traces. After receiving those unorganized points from a group of users, the server generates the final map. To provide a high-quality map generation service while preserving the privacy of each user, there are three major challenges we need to address: 1) quantifying the privacy leakage of data points provided by individual users; 2) generating a theoretically-proven map using the reported unorganized point cloud; 3) designing a map generation scheme that is robust to various discrepancies such as GPS error.
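The client-side privacy step can be sketched in a few lines of Java (an illustration under our own assumptions; the full scheme also quantifies privacy leakage before upload): rather than uploading an ordered trajectory, a user shuffles the trace and submits only a few sparse points, so the server sees an unorganized point cloud.

import java.util.*;

public class LocationSelectionSketch {
    // trace: the user's recorded GPS points; k: how many points to reveal.
    static List<double[]> select(List<double[]> trace, int k, Random rnd) {
        List<double[]> copy = new ArrayList<>(trace);
        Collections.shuffle(copy, rnd);                   // destroy temporal order
        return copy.subList(0, Math.min(k, copy.size())); // reveal only k sparse points
    }
}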
Advantages of Proposed System:
1. We generate an accurate and reliable map while avoiding leakage of users' privacy.
SYSTEM ARCHITECTURE

MODULES
1. User Module
2. Server Module
3. Map Generation Module
4. Quality Assessment Module
Module Description:
User:
In this module, the users serve as the GPS location providers. To provide a certain diversity of uploaded data, a finite local buffer is used to record the user's trace. A data report engine, called Location Selection, is activated by a location query from the remote server. Once such a request packet is received, the users look up their corresponding local buffers and reply to the server with the locations that match the request condition.
Server:
The essential function of the server is to provide a high-quality map generation service based on the unorganized GPS locations collected from various users. To guarantee the estimated map quality, all chosen GPS locations first enter a data pre-processing block that removes all unjustified data.
Map Generation:
This module runs on the server side; validated data from the pre-processing block flows into the map generation module, which is implemented with the Crust algorithm.
Quality Assessment:
This module examines the quality of the currently generated map (i.e., the output of Crust). When the predefined map quality metric is not met, the block estimates the optimal cell that will provide the maximal gain in estimating the original map; the server then broadcasts this cell via a request packet to actively pull the useful information.
SYSTEM REQUIREMENTS:
Hardware Requirements:
· System : Pentium IV 2.4 GHz.
· Hard Disk : 40 GB.
· Floppy Drive : 1.44 MB.
· Monitor : 15" VGA Colour.
· Mouse : Logitech.
· RAM : 512 MB.
Software Requirements:
· Operating system : Windows XP/7.
· Coding Language : JAVA
Hybrid Job-Driven Scheduling for Virtual MapReduce Clusters
Abstract:
It is cost-efficient for a tenant with a limited budget to establish a virtual MapReduce cluster by renting multiple virtual private servers (VPSs) from a VPS provider. To provide an appropriate scheduling scheme for this type of computing environment, we propose in this paper a hybrid job-driven scheduling scheme (JoSS for short) from a tenant's perspective. JoSS provides not only job-level scheduling, but also map-task level scheduling and reduce-task level scheduling. JoSS classifies MapReduce jobs based on job scale and job type and designs an appropriate scheduling policy to schedule each class of jobs. The goal is to improve data locality for both map tasks and reduce tasks, avoid job starvation, and improve job execution performance. Two variations of JoSS are further introduced to separately achieve a better map-data locality and a faster task assignment. We conduct extensive experiments to evaluate and compare the two variations with current scheduling algorithms supported by Hadoop. The results show that the two variations outperform the other tested algorithms in terms of map-data locality, reduce-data locality, and network overhead without incurring significant overhead. In addition, the two variations are separately suitable for different MapReduce-workload scenarios and provide the best job performance among all tested algorithms.
Existing System:
MapReduce enables a programmer to define a MapReduce job as a map function and a reduce function, and provides a runtime system to divide the job into multiple map tasks and reduce tasks and perform these tasks on a MapReduce cluster in parallel. Typically, a MapReduce cluster consists of a set of commodity machines/nodes located on several racks and interconnected with each other in a local area network (LAN). Many task scheduling algorithms have been proposed to improve data locality and to shorten job turnaround time, but most of them focus only on scheduling map tasks rather than reduce tasks. Hence, employing them in a virtual MapReduce cluster might cause low reduce-data locality. Besides, most current scheduling algorithms are designed to achieve node locality and rack locality for conventional MapReduce clusters, rather than VPS-locality and Cen-locality for virtual MapReduce clusters. Consequently, adopting them in a virtual MapReduce cluster might be unable to provide a high map-data locality.
Problems in existing system:
1. Low reduce-data locality and map-data locality.
2. A conventional MapReduce cluster is costly for a person/organization with a limited budget; an alternative is to establish a virtual MapReduce cluster by either renting a MapReduce framework from a MapReduce service provider or renting multiple virtual private servers (VPSs) from a VPS provider.
Proposed System:
We propose a hybrid job-driven scheduling scheme (JoSS for short) that provides scheduling at three levels: job, map task, and reduce task. JoSS classifies MapReduce jobs into either large or small jobs based on the ratio of each job's input size to the average datacenter scale of the virtual MapReduce cluster, and further classifies small MapReduce jobs into either map-heavy or reduce-heavy based on the ratio between each job's reduce-input size and the job's map-input size. JoSS then uses a particular scheduling policy to schedule each class of jobs such that the corresponding network traffic generated during job execution (especially inter-datacenter traffic) can be reduced, and the corresponding job performance can be improved. In addition, we propose two variations of JoSS, named JoSS-T and JoSS-J, to guarantee a fast task assignment and to further increase the VPS-locality, respectively.
Advantages in proposed system:
1. We introduce JoSS to appropriately schedule MapReduce jobs in a virtual MapReduce cluster by addressing both map-data locality and reduce-data locality from the perspective of a tenant.
2. By classifying jobs into map-heavy and reduce-heavy jobs and designing the corresponding policies to schedule each class of jobs, JoSS increases data locality and improves job performance.
System Model

Figure 1. An example showing the block locations of job Y in a virtual MapReduce cluster comprising three datacenters.
Modules:
1. Input-Data Classifier
2. Task Scheduler
3. Task Assigner
Modules Description:
1. Input-Data Classifier
The input-data classifier is designed to classify input data uploaded by a user; for example, it can easily determine whether the data is a web document or not.
2. Task Scheduler
Whenever it receives a MapReduce job from a user, the task scheduler determines the type of the job and then schedules it.
3. Task Assigner
The task assigner then determines how to assign a task to a VPS whenever the VPS has an idle slot.
System Requirements
Hardware Requirements:
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Requirements:
· Operating System : Windows XP
· Coding Language : Java
Detecting Node Failures in Mobile Wireless Networks: A Probabilistic Approach
Abstract
Detecting node failures in mobile wireless networks is very challenging because the network topology can be highly dynamic, the network may not be always connected, and the resources are limited. In this paper, we take a probabilistic approach and propose two node failure detection schemes that systematically combine localized monitoring, location estimation and node collaboration. Extensive simulation results in both connected and disconnected networks demonstrate that our schemes achieve high failure detection rates (close to an upper bound) and low false positive rates, and incur low communication overhead. Compared to approaches that use centralized monitoring, our approach has up to 80 percent lower communication overhead, and only slightly lower detection rates and slightly higher false positive rates. In addition, our approach has the advantage that it is applicable to both connected and disconnected networks while centralized monitoring is only applicable to connected networks. Compared to other approaches that use localized monitoring, our approach has similar failure detection rates, up to 57 percent lower communication overhead and much lower false positive rates (e.g., 0.01 versus 0.27 in some settings).
EXISTING SYSTEM
Mobile wireless networks have been used for many mission critical applications, including search and rescue, environment monitoring, disaster relief, and military operations. Such mobile networks are typically formed in an ad-hoc manner, with either persistent or intermittent network connectivity. Nodes in such networks are vulnerable to failures due to battery drainage, hardware defects or a harsh environment. Detecting node failures is important for keeping tabs on the network. It is even more important when the mobile devices are carried by humans and are used as the main/only communication mechanism.
Disadvantages of Existing System:
1. Node failure detection in mobile wireless networks is very challenging because the network topology can be highly dynamic due to node movements. Therefore, techniques that are designed for static networks are not applicable.
2. The network may not always be connected. Therefore, approaches that rely on network connectivity have limited applicability.
3. The limited resources (computation, communication and battery life) demand that node failure detection must be performed in a resource conserving manner.
PROPOSED SYSTEM
In this paper, we propose a novel probabilistic approach that judiciously combines localized monitoring, location estimation and node collaboration to detect node failures in mobile wireless networks. Specifically, we propose two schemes. In the first scheme, when a node A cannot hear from a neighboring node B, it uses its own information about B and binary feedback from its neighbors to decide whether B has failed or not. In the second scheme, A gathers information from its neighbors, and uses the information jointly to make the decision.
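A minimal Java sketch of how such a probabilistic decision could be fused (a simplified Bayesian model under our own assumptions, not the paper's exact estimator): node A combines its prior belief that B has failed with the binary feedback of m neighbors that also monitor B.

public class FailureDetectionSketch {
    // prior: A's initial probability that B failed; pHear: probability that a
    // live B is heard by a monitoring neighbor (hypothetical model parameter).
    static boolean declareFailed(double prior, boolean[] neighborHeardB,
                                 double pHear, double threshold) {
        double pFailed = prior, pAlive = 1 - prior;
        for (boolean heard : neighborHeardB) {
            pFailed *= heard ? 1e-9 : 1.0;       // a failed node is (almost) never heard
            pAlive  *= heard ? pHear : 1 - pHear;
        }
        double posterior = pFailed / (pFailed + pAlive);
        return posterior > threshold;            // declare B failed above the threshold
    }
}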
Advantages of Proposed System:
1. Our approach has the advantage that it is applicable to both connected and disconnected networks.
2. Our schemes achieve high failure detection rates, low false positive rates, and low communication overhead.
SYSTEM ARCHITECTURE

MODULES
We have 3 main Modules.
1. Localized Monitoring Module
2. Location Estimation Module
3. Node Collaboration Module
Module Description:
Localized monitoring:
Localized monitoring only generates localized traffic and has been used successfully for node failure detection in static networks.
Location Estimation:
With localized monitoring, a node only knows that it can no longer hear from a neighbor, but does not know whether the lack of messages is due to node failure or to the neighbor moving out of transmission range. Location estimation helps resolve this ambiguity.
Node Collaboration:
Through this module, node collaboration improves the decisions made during location estimation.
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium IV
· Speed - 1.1 GHz
· RAM - 256 MB (min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
Networking Security
- Multi-Grained Block Management to Enhance the Space Utilization of File Systems on PCM Storages
Multi-Grained Block Management to Enhance the Space Utilization of File Systems on PCM Storages
ABSTRACT
Phase-change memory (PCM) is a promising candidate as a storage medium to resolve the performance gap between main memory and storage in battery-powered mobile computing systems. However, it is more expensive than flash memory, and thus introduces a more serious storage capacity issue for low-cost solutions. This issue is further exacerbated by the fact that existing file systems are usually designed to trade space utilization for performance over block-oriented storage devices. In this work, we propose a multi-grained block management strategy to improve the space utilization of file systems over PCM-based storage systems. By utilizing the byte-addressability and fast read/write feature of PCM, a methodology is proposed to dynamically allocate multiple sizes of blocks to fit the size of each file, so as to resolve the space fragmentation issue with minimized space and management overheads. The space utilization of file systems is analyzed with consideration of block sizes. A series of experiments was conducted to evaluate the efficacy of the proposed strategy, and the results show that the proposed strategy can significantly improve the space utilization of file systems.
EXISTING SYSTEM
PCM is rapidly developed as a promising candidate for next-generation storage class memory (SCM) because of its non-volatility, byte-addressability, and high access performance. In recent years, researchers have studied how to use PCM as main memory or hybrid main memory to improve the system performance and energy efficiency. Although some studies have explored the read/write asymmetry of PCM to further optimize the performance of PCM, using PCM as main memory suffers from the write endurance issue. Thus, some researchers have proposed different strategies to enhance the lifetime of PCM by the adoption of different write reduction or wear leveling techniques in different layers/components. In another direction, a few other researchers have exploited the byte-addressability of PCM to improve the data integrity of file systems, while others have tried to improve the write efficiency of file systems by utilizing the byte-addressability and read/write asymmetry of PCM.
Disadvantages of Existing System:
1. It is more expensive than flash memory, and thus introduces a more serious storage capacity issue for low-cost solutions
PROPOSED SYSTEM
In this work, we propose a multi-grained block management strategy to improve the space utilization of file systems over PCM-based storage systems with minimized space and management overheads. We are interested in inode-based file system designs due to the popularity of mobile computing systems installed with Linux-like operating systems. By utilizing the byte-addressability and fast read/write feature of PCM, the proposed strategy uses and manages multi-grained (or multiple sizes of) blocks for each file to allocate proper block sizes for data storage; thus, the fragmentation issue of inode-based file systems can be resolved. The proposed strategy supports dynamic inode allocation to dynamically allocate and reclaim inodes, so as to further resolve the fixed-inode problem of existing inode-based file systems. To support dynamic inode allocation, an indirection map is designed to remove the cost of searching files for inode reallocation during the reclamation of inode space. The proposed multi-grained block management strategy is evaluated with a space utilization analysis, and a series of experiments was conducted to evaluate the efficacy of the proposed strategy with different types of realistic file data. The results show that the proposed strategy can save a significant amount of the storage space required by other investigated approaches.
Advantages of Proposed System:
1. Its objective is to allocate proper block sizes for each file so as to minimize the internal fragmentation issue imposed by existing file systems
2. Saves significantly more storage space than existing works
SYSTEM ARCHITECTURE

MODULE
We have 2 modules,
1. Block reclamation Module
2. Dynamic inode allocation Module
Module Description:
Block reclamation
In the block reclamation mechanism, when a data block is freed by the movement/deletion of files, the data block can be directly reclaimed. If the released block is a sub-block, the multi-grained strategy checks whether any other occupied sub-blocks exist in the same block. When the block does not contain any occupied sub-blocks, the data block is reclaimed as a free block. If the block contains other occupied sub-blocks, the proposed strategy maintains the released sub-block as a free sub-block in the sub-block bitmap.
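A minimal Java sketch of this reclamation rule follows, assuming a hypothetical layout in which each block is divided into a fixed number of sub-blocks tracked by a per-block bitmap; the granularity and class names are illustrative, not taken from the paper.

```java
import java.util.BitSet;

// Sketch of multi-grained block reclamation: a freed full block is
// reclaimed directly; a freed sub-block only clears its bitmap bit, and
// the parent block is reclaimed once every sub-block bit is clear.
public class BlockReclaimer {

    static final int SUBBLOCKS_PER_BLOCK = 8; // hypothetical granularity

    static class Block {
        final BitSet occupied = new BitSet(SUBBLOCKS_PER_BLOCK); // sub-block bitmap
        boolean free = false;
    }

    /** Free sub-block `idx` of `block`; reclaim the block if it empties. */
    static void freeSubBlock(Block block, int idx) {
        block.occupied.clear(idx);
        if (block.occupied.isEmpty()) {
            block.free = true; // no occupied sub-blocks remain
        }
    }

    public static void main(String[] args) {
        Block b = new Block();
        b.occupied.set(0);
        b.occupied.set(3);
        freeSubBlock(b, 0);
        System.out.println(b.free); // false: sub-block 3 still occupied
        freeSubBlock(b, 3);
        System.out.println(b.free); // true: block reclaimed
    }
}
```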
Dynamic Inode Allocation:
Dynamic inode allocation is supported to resolve the external fragmentation issue caused by the fixed number of inodes in the traditional inode-based file systems. Its objective is to dynamically adjust the number of inodes at runtime so that the storage space allocated for inodes and for file contents can be balanced. This mode of allocation includes an inode translation table to manage inodes so that inodes can be distributed and stored in any block at runtime. Each entry of the inode translation table points to a block (called inode block) that stores consecutive inodes.
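The indirection can be pictured as a table mapping inode numbers to (inode block, slot) pairs, so inodes may live in any block and be reclaimed without scanning files. The sketch below is one possible Java rendering under that assumption; block addresses and sizes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an inode translation table: each entry points to an inode
// block holding consecutive inodes, so inode number -> (block, slot)
// resolves in O(1) and inode blocks can be allocated/reclaimed at runtime.
public class InodeTable {

    static final int INODES_PER_BLOCK = 32; // hypothetical

    // entry i points to the block storing inodes [i*32, (i+1)*32)
    private final List<Integer> inodeBlocks = new ArrayList<>();

    /** Resolve an inode number to its block address and slot. */
    int[] locate(int inodeNo) {
        int entry = inodeNo / INODES_PER_BLOCK;
        int slot = inodeNo % INODES_PER_BLOCK;
        return new int[] { inodeBlocks.get(entry), slot };
    }

    /** Grow the table by one inode block allocated at `blockAddr`. */
    void addInodeBlock(int blockAddr) {
        inodeBlocks.add(blockAddr);
    }

    public static void main(String[] args) {
        InodeTable t = new InodeTable();
        t.addInodeBlock(120); // inodes 0..31 live in block 120
        t.addInodeBlock(305); // inodes 32..63 live in block 305
        int[] loc = t.locate(40);
        System.out.println("block=" + loc[0] + " slot=" + loc[1]); // block=305 slot=8
    }
}
```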
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB(min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
- Spatial Reusability-Aware Routing in Multi-Hop Wireless Networks
Spatial Reusability-Aware Routing in Multi-Hop Wireless Networks
ABSTRACT
In the problem of routing in multi-hop wireless networks, to achieve high end-to-end throughput, it is crucial to find the “best” path from the source node to the destination node. Although a large number of routing protocols have been proposed to find the path with the minimum total transmission count/time for delivering a single packet, such transmission count/time minimizing protocols cannot be guaranteed to achieve maximum end-to-end throughput. In this paper, we argue that by carefully considering spatial reusability of the wireless communication media, we can tremendously improve the end-to-end throughput in multi-hop wireless networks. To support our argument, we propose spatial reusability-aware single-path routing (SASR) and anypath routing (SAAR) protocols, and compare them with existing single-path routing and anypath routing protocols, respectively. Our evaluation results show that our protocols significantly improve the end-to-end throughput compared with existing protocols. Specifically, for single-path routing, the median throughput gain is up to 60 percent, and for each source-destination pair, the throughput gain is as high as 5.3x; for anypath routing, the maximum per-flow throughput gain is 71.6 percent, while the median gain is up to 13.2 percent.
EXISTING SYSTEM
In recent years, a large number of routing protocols have been proposed for multihop wireless networks. However, a fundamental problem with existing wireless routing protocols is that minimizing the overall number (or time) of transmissions to deliver a single packet from a source node to a destination node does not necessarily maximize the end-to-end throughput.
Disadvantages of Existing System:
1. Most existing routing protocols, whether single-path or anypath, rely on link-quality-aware routing metrics, such as link transmission count-based metrics and link transmission time-based metrics
2. Most of the existing routing protocols do not take spatial reusability of the wireless communication media into account
PROPOSED SYSTEM
In this paper, we investigate two kinds of routing protocols: single-path routing and anypath routing. The task of a single-path routing protocol is to select a cost-minimizing path, along which the packets are delivered from the source node to the destination node. Anypath routing has recently emerged as a novel routing technique that exploits the broadcast nature of the wireless communication media to improve end-to-end throughput: it aggregates the power of multiple relatively weak paths to form a strong path, by welcoming any intermediate node that overhears the packet to participate in packet forwarding.
Advantages of Proposed System:
1. We can achieve more significant end-to-end throughput gains under higher data rates
MODULES
We have 2 main modules,
1. Single-path Routing Module
2. Anypath Routing Module
Module Description:
Single-path Routing:
The task of a single-path routing protocol is to select a cost minimizing path, along which the packets are delivered from the source node to the destination node.
Anypath Routing:
This module aggregates the power of multiple relatively weak paths to form a strong path, by welcoming any intermediate node that overhears the packet to participate in packet forwarding.
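For reference, the conventional cost-minimizing single-path selection that SASR improves upon can be sketched as a shortest-path search over per-link expected transmission counts (ETX). The sketch below shows only this baseline, with an assumed adjacency-matrix input; SASR's spatial-reuse-aware cost model is not reproduced here.

```java
import java.util.Arrays;

// Baseline cost-minimizing single-path routing: Dijkstra over per-link
// expected transmission counts (ETX). SASR differs by accounting for
// links that can transmit simultaneously; this sketch is the baseline only.
public class MinCostPath {

    /** Returns the minimum total ETX from src to every node (0 = no link). */
    static double[] dijkstra(double[][] etx, int src) {
        int n = etx.length;
        double[] dist = new double[n];
        boolean[] done = new boolean[n];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        dist[src] = 0.0;
        for (int iter = 0; iter < n; iter++) {
            int u = -1;
            for (int v = 0; v < n; v++)      // pick the closest unsettled node
                if (!done[v] && (u < 0 || dist[v] < dist[u])) u = v;
            if (dist[u] == Double.POSITIVE_INFINITY) break;
            done[u] = true;
            for (int v = 0; v < n; v++)      // relax outgoing links
                if (etx[u][v] > 0 && dist[u] + etx[u][v] < dist[v])
                    dist[v] = dist[u] + etx[u][v];
        }
        return dist;
    }

    public static void main(String[] args) {
        double[][] etx = {
            { 0, 1.2, 0,   0   },
            { 0, 0,   1.5, 0   },
            { 0, 0,   0,   1.1 },
            { 0, 0,   0,   0   },
        };
        System.out.println(dijkstra(etx, 0)[3]); // ~3.8: total ETX of 0->1->2->3
    }
}
```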
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB(min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
- Resource-Saving File Management Scheme for Online Video Provisioning on Content Delivery Networks
Resource-Saving File Management Scheme for Online Video Provisioning on Content Delivery Networks
ABSTRACT
Content delivery networks (CDNs) have been widely implemented to provide scalable cloud services. Such networks support resource pooling by allowing virtual machines or physical servers to be dynamically activated and deactivated according to current user demand. This paper examines online video replication and placement problems in CDNs. An effective video provisioning scheme must simultaneously (i) utilize system resources to reduce total energy consumption and (ii) limit replication overhead. We propose a scheme called adaptive data placement (ADP) that can dynamically place and reorganize video replicas among cache servers on subscribers’ arrival and departure. Both the analyses and simulation results show that ADP can reduce the number of activated cache servers with limited replication overhead. In addition, ADP’s performance approximates that of the optimal solution.
EXISTING SYSTEM
Content Delivery Networks (CDNs) are effective platforms for providing various types of services. Among them, on-demand video provisioning is a popular application, allowing numerous users to arbitrarily request videos from a massive database. Video websites such as YouTube, Vimeo, and DailyMotion are examples. When a visitor arrives and requests a video clip, the system must assign a serving cache server (CS) and copy a replica of the clip from the backhaul database if the CS does not already cache the clip for other visitors. Because each CS has limited capability, the total number of video clips it stores and the total outgoing bandwidth of subscribers it bears are limited by its space and bandwidth constraints.
Disadvantages of Existing System:
1. Because of the many time-variant requirements of video clips, intelligently placing videos among CSs and determining their serving subscribers without violating capacity and bandwidth limits is challenging. Typical CDN management schemes in data centers fail to address this video provisioning problem
PROPOSED SYSTEM
This paper introduces a new problem called resource-saving video placement (RSVP) and proposes a scheme called adaptive data placement (ADP). Through analysis and simulations, we demonstrate the two main advantages of ADP: (i) the worst case performance difference between ADP and the optimal solution can be guaranteed, and (ii) the replication overhead on each arrival or departure of a visitor is limited. Because ADP is based on common assumptions, it can be applied to various types of CDNs to improve their resource and power efficiency.
Advantages of Proposed System:
1. To achieve high resource utilization, our proposed scheme, ADP, follows three principles: (i) it maintains only one OPS server in a system to enable most CSs to achieve at least one aspect (i.e., bandwidth or space) of full utilization; (ii) it maintains the exclusiveness of video clips (i.e., allows at most one replica for each clip) among the OPS and SPF servers to improve space efficiency, which we demonstrate in the next section; and (iii) it conducts less physical replication to limit overhead.
SYSTEM ARCHITECTURE

MODULES
We have two Modules,
1. Arrive Module
2. Depart Module
Module Description:
Arrive:
This ARRIVE process provides fast responsiveness because the initiation of a new CS occurs only after, not on, a subscription’s arrival. Any incoming subscription can be placed into an activated CS (SPF or the OPS) in real time if the arrival and departure processes of previous subscriptions are completed. To achieve an even shorter latency and address highly bursty demands, a buffer that maintains already-activated CSs can also be considered.
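A minimal Java sketch of the ARRIVE placement step follows. The capacity numbers and the first-fit rule are illustrative assumptions rather than ADP's exact policy.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the ARRIVE step: place an incoming subscription on an already
// activated cache server (CS) that has both space for a clip replica and
// spare bandwidth; otherwise activate a new CS. Capacities and the
// placement rule are illustrative assumptions, not ADP verbatim.
public class ArriveHandler {

    static class CacheServer {
        int freeSpace;      // clips it can still store
        int freeBandwidth;  // subscribers it can still serve
        CacheServer(int space, int bw) { freeSpace = space; freeBandwidth = bw; }
    }

    final List<CacheServer> activated = new ArrayList<>();

    /** Assign a subscription needing one clip replica and one bandwidth unit. */
    CacheServer place() {
        for (CacheServer cs : activated) {
            if (cs.freeSpace >= 1 && cs.freeBandwidth >= 1) {
                cs.freeSpace--;      // replicate the clip if not cached
                cs.freeBandwidth--;  // serve the subscriber
                return cs;
            }
        }
        CacheServer fresh = new CacheServer(10, 20); // activate a new CS
        fresh.freeSpace--;
        fresh.freeBandwidth--;
        activated.add(fresh);
        return fresh;
    }

    public static void main(String[] args) {
        ArriveHandler h = new ArriveHandler();
        h.place();
        System.out.println("activated servers: " + h.activated.size()); // 1
    }
}
```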
Depart:
In this process, we reorganize the replica placement in CSs when a subscription leaves the system.
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB(min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
Parallel And Distributed System
- Hybrid Job-Driven Scheduling for Virtual Map Reduce Clusters
Hybrid Job-Driven Scheduling for Virtual Map Reduce Clusters
ABSTRACT:
It is cost-efficient for a tenant with a limited budget to establish a virtual MapReduce cluster by renting multiple virtual private servers (VPSs) from a VPS provider. To provide an appropriate scheduling scheme for this type of computing environment, we propose in this paper a hybrid job-driven scheduling scheme (JoSS for short) from a tenant’s perspective. JoSS provides not only job-level scheduling, but also map-task level and reduce-task level scheduling. JoSS classifies MapReduce jobs based on job scale and job type and designs an appropriate scheduling policy to schedule each class of jobs. The goal is to improve data locality for both map tasks and reduce tasks, avoid job starvation, and improve job execution performance. Two variations of JoSS are further introduced to separately achieve a better map-data locality and a faster task assignment. We conduct extensive experiments to evaluate and compare the two variations with current scheduling algorithms supported by Hadoop. The results show that the two variations outperform the other tested algorithms in terms of map-data locality, reduce-data locality, and network overhead without incurring significant overhead. In addition, the two variations are separately suitable for different MapReduce workload scenarios and provide the best job performance among all tested algorithms.
EXISTING SYSTEM:
MapReduce enables a programmer to define a MapReduce job as a map function and a reduce function, and provides a runtime system to divide the job into multiple map tasks and reduce tasks and perform these tasks on a MapReduce cluster in parallel. Typically, a MapReduce cluster consists of a set of commodity machines/nodes located on several racks and interconnected with each other in a local area network (LAN). Many task scheduling algorithms have been proposed to improve data locality and to shorten job turnaround time, but most of them focus only on scheduling map tasks, rather than scheduling reduce tasks. Hence, employing them in a virtual MapReduce cluster might cause low reduce-data locality. Besides, most current scheduling algorithms are designed to achieve node locality and rack locality for conventional MapReduce clusters, rather than VPS-locality and Cen-locality for virtual MapReduce clusters. Consequently, adopting them in a virtual MapReduce cluster might be unable to provide high map-data locality.
Problems in existing system:
1. Low reduce-data locality and map-data locality
2. A physical MapReduce cluster is costly for a person or organization with a limited budget; an alternative is to establish a virtual MapReduce cluster by either renting a MapReduce framework from a MapReduce service provider or renting multiple virtual private servers (VPSs) from a VPS provider.
PROPOSED SYSTEM:
We propose a hybrid job-driven scheduling scheme (JoSS for short) that provides scheduling at three levels: job, map task, and reduce task. JoSS classifies MapReduce jobs into either large or small jobs based on each job’s input size relative to the average datacenter scale of the virtual MapReduce cluster, and further classifies small jobs into either map-heavy or reduce-heavy based on the ratio between each job’s reduce-input size and its map-input size (a classification sketch follows the advantages list below). JoSS then uses a particular scheduling policy to schedule each class of jobs so that the corresponding network traffic generated during job execution (especially inter-datacenter traffic) is reduced and job performance is improved. In addition, we propose two variations of JoSS, named JoSS-T and JoSS-J, to guarantee a fast task assignment and to further increase the VPS-locality, respectively.
Advantages in proposed system:
1. We introduce JoSS to appropriately schedule Map Reduce jobs in a virtual Map Reduce cluster by addressing both map-data locality and reduce-data locality from the perspective of a tenant.
2. By classifying jobs into map-heavy and reduce-heavy jobs and designing the corresponding policies to schedule each class of jobs, JoSS increases data locality and improves job performance.
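The classification described above can be sketched as follows in Java; the input sizes, the ratio threshold, and the field names are illustrative assumptions, since the exact cut-offs are not given here.

```java
// Sketch of JoSS-style job classification: a job is "large" if its input
// size exceeds the average datacenter scale of the virtual cluster; small
// jobs are "reduce-heavy" when the ratio of reduce-input to map-input
// size crosses a threshold. The threshold value is an assumption.
public class JobClassifier {

    enum JobClass { LARGE, SMALL_MAP_HEAVY, SMALL_REDUCE_HEAVY }

    static final double REDUCE_HEAVY_RATIO = 1.0; // assumed cut-off

    static JobClass classify(long mapInputBytes,
                             long reduceInputBytes,
                             long avgDatacenterScaleBytes) {
        if (mapInputBytes > avgDatacenterScaleBytes) {
            return JobClass.LARGE;
        }
        double ratio = (double) reduceInputBytes / mapInputBytes;
        return ratio > REDUCE_HEAVY_RATIO
                ? JobClass.SMALL_REDUCE_HEAVY
                : JobClass.SMALL_MAP_HEAVY;
    }

    public static void main(String[] args) {
        System.out.println(classify(1_000, 5_000, 10_000));  // SMALL_REDUCE_HEAVY
        System.out.println(classify(50_000, 1_000, 10_000)); // LARGE
    }
}
```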
System Model

Figure 1. An example showing the block locations of job Y in a virtual MapReduce cluster comprising three datacenters.
Modules:
1. Input-Data Classifier
2. Task Scheduler
3. Task Assigner
Modules Description:
1. Input-Data Classifier
The input-data classifier is designed to classify the input data uploaded by a user; for example, it can easily determine whether the data is a web document or not.
2. Task Scheduler
Whenever receiving a Map Reduce job from a user, the task scheduler determines the type of the job and then schedules the job.
3. Task Assigner
The task assigner then determines how to assign a task to a VPS whenever the VPS has an idle slot.
SYSTEM REQUIREMENTS
Hardware Requirements:
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Requirements:
· Operating System : Windows XP
· Coding Language : Java
- Resource Allocation in Cloud Computing Using the Uncertainty Principle of Game Theory
Resource Allocation in Cloud Computing Using the Uncertainty Principle of Game Theory
ABSTRACT
Virtualization of resources on the cloud offers a scalable means of consuming services beyond the capabilities of small systems. In a cloud that offers infrastructure such as processors, memory, hard disks, etc., a coalition of virtual machines formed by grouping two or more may be needed. Economical management of cloud resources needs allocation strategies with minimum wastage, while configuring services ahead of actual requests. We propose a resource allocation mechanism for machines on the cloud, based on the principles of coalition formation and the uncertainty principle of game theory. We compare the results of applying this mechanism with existing resource allocation methods that have been deployed on the cloud. We also show that this method of resource allocation by coalition formation of the machines on the cloud leads not only to better resource utilization but also to higher request satisfaction.
EXISTING SYSTEM
Optimizing resource allocation to ensure the best performance can be done in many ways. Present IaaS service providers, largely unaware of application-level requirements, do not provide any optimization by configuring the required software on the VMs. Relying only on application-level optimization is not sensible, as it is restricted to an existing infrastructure allocation. The placement of VMs is, however, in the hands of the IaaS provider and can be changed based on the topology of the machines in the cloud system. An application-level optimization technique along with topology-based VM placement offers better chances of performance improvement with lower resource wastage.
Problems in existing system
1. The cloud providers’ current situation is that they know the type of VMs that may be requested but are unaware of the exact request specifications such as the number of instances of a particular type of VM.
2. Heavy resource wastage
PROPOSED SYSTEM
In this paper, we model the cloud as a multi-agent system that is composed of agents (machines) with varied capabilities. Allocation of resources to perform specific tasks requires agents to form coalitions, as the resource requirements may be beyond the capabilities of any single agent (machine). Coalition formation is modeled as a game and uses the uncertainty principle of game theory to arrive at approximately optimal strategies of the game. We implement a resource allocation mechanism for the cloud that is demand-aware, topology-aware, and uses a game-theoretic approach based on coalition formation of machines for requests with uncertain task information. With these ideas in place, we can use our agent-based resource allocation mechanism for the IaaS cloud. The evaluation of the efficacy of our approach is carried out by comparison with common commercial allocation strategies on the cloud. We evaluate it based on randomly generated VM requests that include data-intensive requests.
Advantages in proposed system
1. By solving the optimization problem of coalition formation, we avoid the complexities of integer programming.
2. The resource allocation mechanism, when deployed, is found to perform better with respect to lower task allocation time, lower resource wastage, and higher request satisfaction.
System Module

Modules
1. Resource Allocation Through Coalition Formation
2. Negotiation step
3. Task allocation
Module Description
1. Resource Allocation Through Coalition Formation
We assume that VM configurations are available with the cloud service provider. A typical IaaS client requests VMs; each host machine chooses a coalition (strategy) from a set of feasible coalitions (strategies), and each coalition has a different payoff associated with it.
2. Negotiation step
The host machines use their preference lists to check the feasibility of coalitions in that order. Once an open coalition is assigned a task, the host machines remain part of the coalition until the task is complete.
3. Task allocation
The task allocator and the host machines have access to the knowledge base, which has the exhaustive list of the VM types that may be requested and the host machine configurations.
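A minimal Java sketch of the negotiation step follows, reducing feasibility to a simple capacity check; the payoff values and coalition names are hypothetical, and the actual game-theoretic payoff computation is not shown.

```java
import java.util.List;

// Sketch of the negotiation step: a host machine walks its preference
// list (coalitions ordered by payoff) and joins the first coalition that
// is feasible for the requested task. Feasibility here is a simple
// capacity check standing in for the real payoff model.
public class CoalitionChooser {

    record Coalition(String name, int totalCpu, int totalMemGb, double payoff) {}

    /** Pick the highest-payoff coalition able to host the request. */
    static Coalition choose(List<Coalition> preferenceList,
                            int cpuNeeded, int memNeeded) {
        for (Coalition c : preferenceList) {   // list is sorted by payoff
            if (c.totalCpu() >= cpuNeeded && c.totalMemGb() >= memNeeded) {
                return c;                      // first feasible coalition wins
            }
        }
        return null;                           // no feasible coalition
    }

    public static void main(String[] args) {
        List<Coalition> prefs = List.of(
            new Coalition("A+B", 8, 16, 0.9),
            new Coalition("A+C", 16, 64, 0.7));
        System.out.println(choose(prefs, 12, 32).name()); // A+C
    }
}
```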
SYSTEM REQUIREMENTS
Hardware Requirements:
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Requirements:
· Operating System : Windows XP
· Coding Language : Java
- A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data
A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data
ABSTRACT
Due to the increasing popularity of cloud computing, more and more data owners are motivated to outsource their data to cloud servers for great convenience and reduced cost in data management. However, sensitive data should be encrypted before outsourcing for privacy requirements, which obsoletes data utilization like keyword-based document retrieval. In this paper, we present a secure multi-keyword ranked search scheme over encrypted cloud data, which simultaneously supports dynamic update operations like deletion and insertion of documents. Specifically, the vector space model and the widely used TF-IDF model are combined in the index construction and query generation. We construct a special tree-based index structure and propose a “Greedy Depth-first Search” algorithm to provide efficient multi-keyword ranked search. The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile ensures accurate relevance score calculation between encrypted index and query vectors. In order to resist statistical attacks, phantom terms are added to the index vector to blind search results. Due to the use of our special tree-based index structure, the proposed scheme can achieve sub-linear search time and deal with the deletion and insertion of documents flexibly. Extensive experiments are conducted to demonstrate the efficiency of the proposed scheme.
EXISTING SYSTEM
Existing keyword-based information retrieval techniques, which are widely used on plaintext data, cannot be directly applied to encrypted data, and downloading all the data from the cloud to decrypt locally is obviously impractical. In order to address this problem, researchers have designed some general-purpose solutions with fully homomorphic encryption or oblivious RAMs. However, these methods are not practical due to their high computational overhead for both the cloud server and the user. On the contrary, more practical special-purpose solutions, such as searchable encryption (SE) schemes, have made specific contributions in terms of efficiency, functionality, and security. Searchable encryption schemes enable the client to store encrypted data in the cloud and execute keyword search over the ciphertext domain.
Disadvantages:
1. The cloud service providers (CSPs) that keep the data for users may access user’s sensitive information without authorization.
2. If users upload files to the cloud without encryption, their data is exposed; yet once the data is encrypted, existing keyword search techniques cannot be applied directly to it.
Proposed System:
This paper proposes a secure tree-based search scheme over encrypted cloud data, which supports multi-keyword ranked search and dynamic operations on the document collection. Specifically, the vector space model and the widely used “term frequency (TF) inverse document frequency (IDF)” model are combined in the index construction and query generation to provide multi-keyword ranked search. In order to obtain high search efficiency, we construct a tree-based index structure and propose a “Greedy Depth-first Search (GDFS)” algorithm based on this index tree. Due to the special structure of our tree-based index, the proposed search scheme can flexibly achieve sub-linear search time and deal with the deletion and insertion of documents. The secure kNN algorithm is utilized to encrypt the index and query vectors, and meanwhile ensure accurate relevance score calculation between encrypted index and query vectors. To resist different attacks in different threat models, we construct two secure search schemes: the basic dynamic multi-keyword ranked search (BDMRS) scheme for the known ciphertext model, and the enhanced dynamic multi-keyword ranked search (EDMRS) scheme for the known background model.
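The plaintext computation underlying the scheme can be sketched as follows: documents and queries become TF-IDF vectors, and relevance is their inner product, which the secure kNN algorithm would evaluate over encrypted vectors. The TF/IDF formulas and names below are common textbook choices used as assumptions, not necessarily the paper's exact variants, and the encryption step itself is not shown.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the plaintext relevance computation that secure kNN protects:
// documents and queries become TF-IDF vectors over a keyword dictionary,
// and relevance is their inner product. In the real scheme the server
// computes this score over encrypted vectors without seeing them.
public class TfIdfRanker {

    /** TF-IDF weight for one keyword in one document (textbook variant). */
    static double tfIdf(int termFreq, int docCount, int docsWithTerm) {
        double tf = 1 + Math.log(termFreq);
        double idf = Math.log((double) docCount / docsWithTerm);
        return tf * idf;
    }

    /** Inner-product relevance between a document vector and a query vector. */
    static double relevance(Map<String, Double> doc, Map<String, Double> query) {
        double score = 0;
        for (var e : query.entrySet()) {
            score += doc.getOrDefault(e.getKey(), 0.0) * e.getValue();
        }
        return score;
    }

    public static void main(String[] args) {
        Map<String, Double> doc = new HashMap<>();
        doc.put("cloud", tfIdf(3, 100, 20));   // frequent, moderately rare term
        doc.put("search", tfIdf(1, 100, 50));
        Map<String, Double> query = Map.of("cloud", 1.0, "search", 1.0);
        System.out.println(relevance(doc, query)); // higher score = more relevant
    }
}
```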
Advantages:
1. We design a searchable encryption scheme that supports both the accurate multi-keyword ranked search and flexible dynamic operation on document collection.
2. The proposed scheme can achieve higher search efficiency by executing our “Greedy Depth-first Search” algorithm.
System Architecture:

Fig: The architecture of ranked search over encrypted cloud data
Modules
Data owner: The data owner has a collection of documents F = {f1, f2, . . . , fn} that he wants to outsource to the cloud server in encrypted form while still keeping the capability to search on them for effective utilization. In our scheme, the data owner first builds a secure searchable tree index I from the document collection F, and then generates an encrypted document collection C for F. Afterwards, the data owner outsources the encrypted collection C and the secure index I to the cloud server, and securely distributes the key information for trapdoor generation (including keyword IDF values) and document decryption to the authorized data users. The data owner is responsible for the update operations on his documents stored in the cloud server. While updating, the data owner generates the update information locally and sends it to the server.
Data users: Data users are authorized to access the documents of the data owner. With t query keywords, an authorized user can generate a trapdoor TD according to search control mechanisms to fetch k encrypted documents from the cloud server. Then, the data user can decrypt the documents with the shared secret key.
Cloud server: The cloud server stores the encrypted document collection C and the encrypted searchable tree index I for the data owner. Upon receiving the trapdoor TD from the data user, the cloud server executes search over the index tree I, and finally returns the corresponding collection of top-k ranked encrypted documents. Besides, upon receiving update information from the data owner, the server needs to update the index I and document collection C accordingly. The cloud server in the proposed scheme is considered “honest-but-curious”, a model employed by many works on secure cloud data search.
SYSTEM CONFIGURATION
Hardware Configuration
· Processor - Pentium –IV
· Speed - 1.1 GHz
· RAM - 256 MB(min)
· Hard Disk - 20 GB
· Key Board - Standard Windows Keyboard
· Mouse - Two or Three Button Mouse
· Monitor - SVGA
Software Configuration
· Operating System : Windows XP
· Programming Language : JAVA
Matlab Publication
- Face Recognition Using Principal Component Analysis
Face Recognition Using Principal Component Analysis
- Energy-Efficient Configuration of Spatial and Frequency Resources in MIMO-OFDMA Systems
Energy-Efficient Configuration of Spatial and Frequency Resources in MIMO-OFDMA Systems
ABSTRACT
In this paper, we propose adaptive configuration of spatial and frequency resources to maximize energy efficiency (EE) and reveal the relationship between the spectral efficiency (SE) and the EE in downlink multiple-input multiple-output (MIMO) orthogonal frequency division multiple access (OFDMA) systems. The problem is formulated as minimizing the total power consumed at the base station under constraints on the average data rates of multiple users, the total number of subcarriers, and the number of radio frequency (RF) chains. A two-step searching algorithm is developed to solve this problem, which first finds the near-optimal numbers of subcarriers for multiple users based on Karush-Kuhn-Tucker (KKT) conditions and then optimizes the number of active RF chains. Simulation results demonstrate that increasing frequency resource improves both the SE and the EE, and is more efficient than increasing spatial resource. Consequently, there is a tradeoff between the SE and the EE only when the frequency resource is limited. In general, the adaptive configuration of spatial and frequency resources outperforms the adaptive configuration of only spatial resource and that of only frequency resource.
INTRODUCTION
Multiple-input multiple-output (MIMO) orthogonal frequency division multiple access (OFDMA) systems are very popular these days owing to their high spectral efficiency (SE). However, whether they also achieve high energy efficiency (EE) is not clear. Although MIMO requires lower transmit power than single-input single-output (SISO) for the same data rate, it consumes more circuit power because more active transmit or receive radio frequency (RF) chains are used [1].
On the other hand, in MIMO-OFDMA systems, spatial precoding and other baseband processing are carried out on each subcarrier, so the circuit power consumed by processing increases with the number of subcarriers. Since signal processing becomes more complicated with increasing demands on data rate and transmission reliability, we cannot neglect the circuit power consumed by both spatial and frequency resources, in addition to the transmit power, when designing an energy-efficient MIMO-OFDMA system. There are some preliminary results on energy saving by adaptively using the spatial and frequency resources. The EE of the Alamouti diversity scheme is discussed in [1]. It has been shown that if the modulation order is adaptively adjusted to balance the transmit and circuit power consumption, multiple-input single-output always does better than SISO.
Adaptive switching between MIMO and single-input multiple-output modes is addressed in [2] to save energy in uplink cellular networks. The relationship between the EE and bandwidth is investigated in [3] and [4]: the EE has been shown to increase with bandwidth if the circuit power consumption either does not depend on or linearly increases with the bandwidth. Energy-efficient link adaptation for MIMO-OFDM systems is studied in [5], where the active RF chains, the overall bandwidth, and the MIMO transmission mode can be adjusted according to the data rate requirement and channel fading. Prior work mainly focuses on point-to-point MIMO transmission. In downlink MIMO-OFDMA networks, RF chains are shared by different users; in this scenario, switching RF chains on or off and allocating bandwidth are intertwined, which complicates the study of the EE. In this paper, we study adaptive configuration of spatial and frequency resources to improve the EE in downlink MIMO-OFDMA systems.
SYSTEM MODEL
Consider a downlink MIMO-OFDMA system with one base station (BS) and M users. Nt and Nr RF chains are configured at the BS and at each user side, respectively. Overall, K subcarriers are shared by multiple users without overlap. Since a large portion of power is consumed by the BS during downlink transmission [3], we focus on how to save energy at the BS side. We consider that the number of active RF chains at the BS and the number of subcarriers allocated to each user can be adjusted based on the data rates required by the users.
A typical structure of MIMO-OFDMA systems is shown in Fig. 1. The data first pass through the channel coding and modulation mapping unit, where they are mapped into complex symbols. After spatial processing in the MIMO encoder unit, the signals are output to nt active RF chains. Several OFDM operations are performed on each RF branch, including serial-to-parallel conversion (S/P), inverse fast Fourier transform (IFFT), and parallel-to-serial conversion (P/S). After digital processing, the analog signals generated by the digital-to-analog converter (D/A) are filtered and up-converted to a high frequency band. Finally, the signals are amplified by the power amplifiers (PAs) and radiated into the air.
POWER CONSUMPTION AT BASE STATION
The total power taken by the BS consists of transmit power and circuit power. The transmit power consumption is contributed by the PAs at the RF chains. Let ρ and P_i denote the efficiency of the PAs and the transmit power for user i per subcarrier and per RF chain, respectively. Then, with k_i subcarriers allocated to user i and n_t active RF chains, the transmit power consumption can be expressed as
P_tx = (1/ρ) Σ_{i=1}^{M} k_i n_t P_i. (1)
Besides a fixed circuit power consumption that keeps the BS operating, the circuit power consumed by different components depends on different system parameters. For example, the circuit power consumption of the channel coding and modulation mapping unit is proportional to the data rate [6]. The circuit power consumptions of the different components are described in Table 1. Based on (1) and the circuit power consumption models, the total power consumption of the BS can be represented as in (2).
TABLE 1 : CIRCUIT POWER CONSUMPTIONS OF DIFFERENT COMPONENTS OF BS:
When the data rates of multiple users are given, maximizing the EE is equivalent to minimizing the total power consumption at the BS. Considering the constraints on the total number of subcarriers and the number of active RF chains, the optimization problem can be formulated as follows,
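The formulation itself is not reproduced above; based on the constraint labels (7a)-(7c) referenced later in the text, it presumably has the following shape, where f(n_t, ω_i P_i) is the per-subcarrier average rate function used in (8) and C_i is user i's average-rate requirement. This is a reconstruction, not the paper's verbatim problem (7):

```latex
% Reconstructed shape of problem (7); C_i is user i's average-rate
% requirement, K the total subcarriers, N_t the available RF chains.
\begin{aligned}
\min_{\{k_i\},\{P_i\},\,n_t}\quad & P_{\mathrm{total}}\big(\{k_i\},\{P_i\},n_t\big) \\
\text{s.t.}\quad & k_i\, f(n_t,\omega_i P_i) \ge C_i, \quad i=1,\dots,M, & (7a)\\
& \textstyle\sum_{i=1}^{M} k_i \le K, & (7b)\\
& n_t \le N_t. & (7c)
\end{aligned}
```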
ERGODIC CAPACITY


ENERGY EFFICIENCY OPTIMIZATION
The EE in downlink transmission is defined as the overall average number of bits transmitted from the BS per unit energy [9], which equals the sum of the average capacities of the multiple users per unit power. From the total power consumption in (2), the EE of the downlink MIMO-OFDMA network is
EE = (Σ_{i=1}^{M} C_i) / P_total,
where C_i is the average capacity (data rate) of user i.
The SE in downlink transmission, which is defined as the overall average data rate per unit bandwidth, depends on the data rates of the multiple users. To study the SE-EE relationship, we formulate a problem that maximizes the EE under constraints on the average data rates required by the multiple users.
Substituting (10) into (11) and after some manipulations, (11a) can be rewritten as
TWO-STEP SEARCHING ALGORITHM
A. Solution of Problem (9) Given the Number of Active RF Chains
A.1 Solution of Continuous Numbers of Subcarriers
When the numbers of subcarriers used by different users, {k_i}_{i=1}^{M}, are relaxed to continuous variables, they can be expressed as a function of {P_i}_{i=1}^{M} from constraint (7a) as
k_i = C_i / f(n_t, ω_i P_i), i = 1, 2, . . . , M. (8)
Substituting (8) into problem (7), and considering that constraint (7c) can be discarded automatically for a given n_t, we obtain a new optimization problem as follows.
The Lagrange function of problem (9) is given in (10), where Φ(P) denotes the objective function of problem (9), and λ and {ξ_i}_{i=1}^{M} represent the Lagrange multipliers for constraints (9a) and (9b), respectively. The Karush-Kuhn-Tucker (KKT) conditions of problem (9) can be expressed as follows.
SIMULATION RESULTS

CONCLUSION
We first formulated the optimization problem to minimize the total power consumed at the BS under average data rate requirements from multiple users, and then developed a two-step searching algorithm. Simulation results indicate that increasing frequency resource helps to improve both the SE and the EE. The tradeoff between the SE and the EE exists only when the total number of active subcarriers is restricted by a maximum value. On the other hand, the optimal number of active RF chains increases only when the total number of used subcarriers cannot be increased, which means that frequency resource is more efficient than spatial resource at improving the EE. The proposed spatial-frequency resource adaptive configuration outperforms both the spatial-only and the frequency-only adaptation.
REFERENCES
[1] S. Cui, A. J. Goldsmith, and A. Bahai, “Energy-efficiency of MIMO and cooperative MIMO techniques in sensor networks,” IEEE J. Sel. Areas Commun., vol. 22, no. 6, pp. 1089–1098, Aug. 2004.
[2] H. Kim, C.-B. Chae, G. de Veciana, and R. W. Heath, Jr., “A cross-layer approach to energy efficiency for adaptive MIMO systems exploiting spare capacity,” IEEE Trans. Wireless Commun., vol. 8, no. 8, pp. 4264–4275, Aug. 2009.