ABSTRACT

In this paper, we propose a feature-free method for detecting phishing websites using the Normalized Compression Distance (NCD), a parameter-free similarity measure which computes the similarity of two websites by compressing them, thus eliminating the need to perform any feature extraction. It also removes any dependence on a specific set of website features. This method examines the HTML of webpages and computes their similarity with known phishing websites, in order to classify them. We use the Furthest Point First algorithm to perform phishing prototype extractions, in order to select instances that are representative of a cluster of phishing webpages. We also introduce the use of an incremental learning algorithm as a framework for continuous and adaptive detection without extracting new features when concept drift occurs. On a large dataset, our proposed method significantly outperforms previous methods in detecting phishing websites, with an AUC score of 98.68%, a high true positive rate (TPR) of around 90%, while maintaining a low false positive rate (FPR) of 0.58%. Our approach uses prototypes, eliminating the need to retain long term data in the future, and is feasible to deploy in real systems with a processing time of roughly 0.3 seconds.
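
As a minimal sketch of the compression-based similarity described above: NCD for two byte strings x and y is commonly defined as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The abstract does not name a compressor, so zlib is used here purely as an assumption. The function names `compressed_size`, `ncd`, and `furthest_point_first` are hypothetical, and the prototype selection is the textbook greedy Furthest Point First procedure, not necessarily the authors' exact implementation.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed input (zlib is an assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def furthest_point_first(pages, k):
    """Greedy FPF: return the indices of up to k prototype pages.

    Starts from page 0, then repeatedly picks the page whose distance
    to its nearest already-chosen prototype is largest.
    """
    if not pages or k <= 0:
        return []
    chosen = [0]
    # dist[i] = distance from page i to its nearest chosen prototype
    dist = [ncd(pages[0], p) for p in pages]
    dist[0] = float("-inf")  # never pick the same page twice
    while len(chosen) < min(k, len(pages)):
        nxt = max(range(len(pages)), key=lambda i: dist[i])
        chosen.append(nxt)
        dist = [min(d, ncd(pages[nxt], p)) for d, p in zip(dist, pages)]
        dist[nxt] = float("-inf")
    return chosen
```

A new page could then be scored by its smallest `ncd` value against the selected prototypes; any decision threshold would have to be tuned on real data.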

EXISTING SYSTEM

Malicious websites are the basis of most criminal activity on the internet.
The dangers posed by malicious sites are enormous, and end users must be prevented from visiting them.
Users should also refrain from clicking on suspicious Uniform Resource Locators (URLs).
To prevent such attacks, the paper proposes the use of machine learning algorithms to detect phishing websites. The existing Phishing Website Detection (PWD) model is trained on an existing dataset of URLs, each with unique features, and is applied to three machine learning classifiers: support vector machine, logistic regression, and Naïve Bayes. After training and testing the algorithms, the Naïve Bayes classifier recorded the highest accuracy.

DISADVANTAGES

Low accuracy due to training loss
Many website features are not taken into consideration

PROPOSED SYSTEM

Collect a dataset containing phishing and legitimate websites from open-source platforms.
Write code to extract the required features from the URL database.
Analyze and preprocess the dataset using exploratory data analysis (EDA) techniques.
Divide the dataset into training and testing sets.
Run the selected machine learning and deep neural network (DNN) algorithms on the dataset.
Write code to display the evaluation results using accuracy metrics.
Compare the results obtained for the trained models and identify which performs better.
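
The feature extraction, train/test split, and evaluation steps above can be sketched with the Python standard library alone. This is an illustration under stated assumptions: the listing does not say which features are used, so the features chosen here are hypothetical, as are the helper names `extract_url_features`, `train_test_split`, and `evaluate`.

```python
import random
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Derive a few illustrative (assumed) lexical features from a URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # True when the host consists only of digits and dots, e.g. 192.168.0.1
        "has_ip_host": host.replace(".", "").isdigit(),
    }


def train_test_split(rows, test_ratio=0.2, seed=42):
    """Shuffle rows reproducibly and split them into (train, test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) - round(len(rows) * test_ratio)
    return rows[:cut], rows[cut:]


def evaluate(y_true, y_pred):
    """Accuracy, true positive rate, and false positive rate (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "tpr": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }
```

Each trained model would be run on the same test split, and the resulting `evaluate` dictionaries compared side by side.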

DNN: This is a supervised algorithm that is easy to use. It can be used for both classification and regression, but it is more commonly applied to classification. Each data item is represented as a point in an n-dimensional space, where n is the number of features in the data. Classification is then performed by separating the points that belong to different classes.
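
To make the n-dimensional view concrete, the sketch below passes one feature vector (a point with n features) through a tiny feedforward network with a single hidden layer and returns a phishing probability. It is an assumption-laden illustration, not the project's actual model: the layer sizes, the ReLU and sigmoid activations, and the names `dense` and `predict_proba` are all hypothetical, and the weights would normally come from training.

```python
import math


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(x: float) -> float:
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def predict_proba(features, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: features -> hidden layer (ReLU) -> single sigmoid output."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    return sigmoid(dense(hidden, [out_w], [out_b])[0])
```

A sample would be labelled as phishing when the returned probability exceeds a chosen cutoff such as 0.5.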

ADVANTAGES

- Provides a clear idea of how effective each classifier is at phishing detection
- High accuracy by combining the strengths of multiple classifiers
- Fast classification, low memory consumption, adapts over time, and works online

HARDWARE AND SOFTWARE REQUIREMENTS

Software Requirements:

Front End – Anaconda IDE
Backend – SQL
Language – Python 3.8

Hardware Requirements:

Hard Disk: Greater than 500 GB
RAM: Greater than 4 GB
Processor: Intel Core i3 or above

INCLUDING PACKAGES

* Base Paper

* Complete Source Code

* Complete Documentation

* Complete Presentation Slides

* Flow Diagram

* Database File

* Screenshots

* Execution Procedure

* Readme File

* Addons

* Video Tutorials
