Anonymizing packet header fields


Assumptions:
1. The fact that an application is being used in a network is not sensitive if the real IP address of the user is not revealed.

Fields are grouped by header type. For each field: how it can be used to infer private info (Privacy), its importance in attacks (Attacks), and whether to anonymize it (Anonymize?).

IPv4 header:
------------
-Source / Dest IP address
   -Privacy: Real IP addresses can be easily associated with individuals. Reveals communication between 2 hosts.
   -Attacks: In DDoS attacks, patterns in spoofed IP addresses can reveal an attack. The network's response to an attack depends on the relationship between nodes and the topology of the network.
   -Anonymize? Yes, but preserve prefixes
-Version
   -Privacy: Not sensitive.
   -Attacks: There does not seem to be any attack that can be distinguished by the IP version alone. Can be part of an attack signature.
   -Anonymize? User-defined
-Header length
   -Privacy: Application -> OS, but only if the header length is unique and the application is OS-specific.
   -Attacks: There does not seem to be any attack that can be distinguished by header length alone. Can be part of an attack signature.
   -Anonymize? User-defined
-Type of Service: Precedence
   -Privacy (router): Differences in the precedence field of ICMP error messages can reveal the type of router.
      lists.insecure.org/lists/nmap-hackers/2000/Oct-Dec/0009.html
      www.phrack.org/show.php?p=57&a=7
   -Privacy (host): Application -> OS. Depends on whether any applications use the precedence field. Not commonly used.
   -Attacks: Can be part of an attack signature.
   -Anonymize? User-defined
-Type of Service (RFC 1349)
   -Privacy: Can infer protocol (www.iana.org/assignments/ip-parameters). Used in ICMP datagrams -> OS fingerprinting.
      lists.insecure.org/lists/nmap-hackers/2000/Jul-Sep/0037.html
   -Attacks: Part of attack signature, may not be crucial to the attack's execution (stacheldraht).
      www.geocrawler.com/archives/3/4890/2001/3/0/5457179/
   -Anonymize? User-defined
-Total length
   -Privacy: Upper protocol, type of data sent / Protocol -> Application -> OS, but only if the length is unique and the application is OS-specific.
   -Attacks: Attacks may be identified by the upper protocol or the size of the packet.
   -Anonymize? User-defined
-Total length - Header length
   -Privacy: Upper protocol, type of data sent -> Application -> OS
   -Attacks: Can be part of an attack signature. Attacks may be identified by size of payload, e.g. Tiny fragment attack.
      www.zvon.org/tmRFC/RFC1858/Output/chapter3.html
   -Anonymize? User-defined
-Identification
   -Privacy: Aids reassembly of fragments, but only if payload & frag offset are known.
   -Attacks: Can be part of an attack signature. Fragment attacks (jolt2.c), where a stream of identical packets is sent.
      www.securityfocus.com/archive/1/62011/2002-11-29/2002-12-05/0
   -Anonymize? User-defined
-Flags
   -Privacy: Can help deduce the application, but may not be unique enough for much info to be gained from knowing this field's value.
   -Attacks: Part of attack signature for known attacks. Overlapping fragment attack (to bypass firewalls).
      www.zvon.org/tmRFC/RFC1858/Output/chapter4.html
   -Anonymize? Do not anonymize
-Fragment offset
   -Privacy: Enables reassembly of a fragmented packet, if the payload is known.
   -Attacks: Fragment attacks (jolt2.c), where a stream of identical packets is sent. Can be part of an attack signature.
      www.securityfocus.com/archive/1/62011/2002-11-29/2002-12-05/0
   -Anonymize? Do not anonymize
-Time to Live
   -Privacy: Estimate of how many hops the packet has travelled since it left the original source -> info about routing at that point in time.
   -Attacks: Can be part of an attack signature. Attacks can be identified in part by the initial TTL (Tiny fragment attack).
   -Anonymize? User-defined
-Protocol
   -Privacy: Protocol -> Application -> OS
   -Attacks: Can be part of an attack signature. Attacks may be identified by upper protocol.
   -Anonymize? User-defined
-Header checksum
   -Privacy: Some OS's have an incorrect implementation of the checksum (OS fingerprinting).
      www.phrack.org/show.php?p=57&a=7 (phrack_icmp_osfingerprint.html)
      If all fields are known except the address fields, the checksum may be able to rule out some sets of possible IP addresses. (peuhkuri)
   -Attacks: Can be part of an attack signature. Attack packets may have the checksum set to 0; such packets should be discarded by a router but may cause it to crash. (0 allowed for UDP, ICMP headers, but not for IP)
      www.securityfocus.com/archive/1/62011/2002-11-29/2002-12-05/0
   -Anonymize? No
-Options
   -Option types:
      -Loose source & record route (LSRR), Strict source & record route (SSRR): allows the source to supply routing info to be used by gateways in forwarding the packet to the destination, and records route info
      -Record Route: records the route of the packet
      -Security
      -Internet Timestamp: timestamp, internet address of registering entity
   -Privacy: Can reveal routing information, revealing the source of a packet. Not sensitive if real IP addresses are not revealed. Could not find much info on how commonly used these fields are.
   -Attacks: IP packets with invalid IP option settings can cause denial of service.
      www.cisco.com/warp/public/707/vpn3k-ipoptions-vuln-pub.shtml
   -Anonymize? User-defined
-Padding
   -Privacy: May be used in OS fingerprinting. If a dirty buffer is used for padding, may reveal OS, other information.
   -Attacks: Could be used in attack packets.
   -Anonymize? Do not anonymize

TCP header:
-----------
-Source/Dest port
   -Privacy: Application, OS (if the application only runs on a specific OS)
   -Attacks: An attacker may be able to get through a firewall by using a particular port. If this port number also appears in legitimate packets in the trace, this could have an effect on the false positive rate. Part of firewall / IDS rules.
   -Anonymize? Do not anonymize
-Sequence No./Ack No.
   -Privacy: Can identify OS by looking at how predictable the sequence numbers are.
   -Attacks: Snort signature #241 (DDoS Shaft syn flood), FreeBSD TCP RST DoS.
      ciac.llnl.gov/ciac/bulletins/j-008.shtml
   -Anonymize? Do not anonymize
-Data offset
   -Privacy: Size of TCP header -> Application (tells us whether Options are present)
   -Attacks: Relevant if an attack consists of TCP headers of an unusual size. Could be part of an attack signature.
   -Anonymize? Do not anonymize
-Reserved (flags) (ECE, CWR, see RFC 3168)
   -Privacy: Congestion of the network at the time the trace was taken. Not sensitive.
   -Attacks: FreeBSD "ipfw/ip6fw" vulnerability.
      http://www.ciac.org/ciac/bulletins/l-029.shtml
   -Anonymize? Do not anonymize
-Flags: URG, ACK, PSH, RST, SYN, FIN
   -Privacy: Upper protocol/Application
   -Attacks: Snort signature #241 (DDoS Shaft syn flood)
   -Anonymize? Do not anonymize
-Window
   -Privacy: Congestion info at the time the trace was taken. Not sensitive.
   -Attacks: Can be part of an attack signature.
   -Anonymize? User-defined
-Checksum
   -Privacy: If all fields except the IP address are known, the checksum may be able to rule out some sets of possible IP addresses. (peuhkuri) Some OS's may compute this incorrectly (OS fingerprinting).
   -Attacks: Can be part of an attack signature.
-Urgent ptr
   -Privacy: Can narrow down possible applications if only certain applications use this field.
   -Attacks: Can be part of an attack signature.
-Options
-Padding

UDP header:
-----------
-Source/Dest port
   -Privacy: see TCP above
   -Attacks: see TCP above
   -Anonymize? see TCP above
-Length
   -Privacy: Can deduce type of packet/protocol -> application -> OS
   -Attacks: Can be part of an attack signature.
-Checksum
   -Privacy: If all fields in the header except the IP addresses are known, the checksum can narrow down the range of possible IP addresses.
   -Attacks: Part of attack signature of known attacks. Attack packets may have the checksum set to 0; such packets should be discarded by a router but may cause it to crash.

HTTP header:
------------
-Method
-URL
   -Privacy: Sensitive.
   -Attacks: Bad URL (remote execution of code) could be part of an attack.
   -Anonymize? Anonymize (selectively?)
-Version
   -Privacy: Not sensitive.
   -Anonymize? Do not anonymize
-Header field(s): Date, Server, Last-Modified, Content-Length, Content-type
   -Privacy: Server -> OS
   -Attacks: Can be part of an attack signature.

SMTP header (RFC 821), port 25:
-------------------------------
-Sender/Receiver, Server/Client name
   -Privacy: Sensitive.
   -Anonymize? Anonymize
-Message
   -Privacy: Sensitive.
   -Attacks: Can be part of an attack signature.
   -Anonymize? Anonymize - selectively?

ICMP:
-----
-ICMP ID
   -Privacy: Reveals info about routers?
   -Attacks: Part of attack signature.
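
The recommendation for source/destination IP addresses ("Yes, but preserve prefixes") can be illustrated with a minimal sketch. It assumes a keyed, bit-by-bit scheme in which each output bit depends only on the address bits before it, so two addresses that share a k-bit prefix still share a k-bit prefix after anonymization. Using HMAC-SHA256 as the pseudorandom function, and the names anonymize_ipv4, shared_prefix_len and the example key, are assumptions for illustration, not something these notes prescribe.

```python
import hashlib
import hmac
import ipaddress


def anonymize_ipv4(addr: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonym while preserving shared prefixes."""
    original = int(ipaddress.IPv4Address(addr))
    result = 0
    for i in range(32):
        # The i most significant bits of the original address (0 when i == 0).
        prefix = original >> (32 - i)
        # Keyed pseudorandom bit that depends only on the prefix and its length.
        msg = i.to_bytes(1, "big") + prefix.to_bytes(4, "big")
        flip = hmac.new(key, msg, hashlib.sha256).digest()[0] & 1
        bit = (original >> (31 - i)) & 1
        result = (result << 1) | (bit ^ flip)
    return str(ipaddress.IPv4Address(result))


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading bits that two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


key = b"example-key"  # hypothetical secret; a real key must be kept private
a = anonymize_ipv4("192.168.1.10", key)
b = anonymize_ipv4("192.168.1.77", key)
print(shared_prefix_len(a, b))  # 25, same as for the two real addresses
```

Because prefixes survive, the anonymized trace still exposes subnet structure, which is the topology leak listed under indirectly sensitive data.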


IDS rules:
----------
Bro
-port TCP/UDP
-IP addr
-content, e.g. filename (ftp), user (finger, ftp); ftp parses PORT requests
-time, time interval

Snort
-port TCP/UDP
-IP addr (local or nonlocal)
-content (strings in content)

NFR
-packet filtering (N-Code language)

Points to note:
---------------
Need to consider: outside contextual info + info from the trace.
Info from the trace may not seem significant, but combined with info gained by other means it can
help an attacker. So should only reveal what is necessary, and nothing more.

If the network the trace was taken from is given, an attacker could compare patterns in the trace with patterns
in the actual network, based on the attacker's own measurements, to identify real IP addresses.

Measurements may look different depending on where they are taken from. Where to take measurements from?

Attacks may be present in the trace, e.g. scans, which can reveal info about OS, open ports.

Payload: anonymize selectively? Snort's attack signatures depend on text in the payload, but removing the payload
from background traffic could affect false positive measurements.

Sensitive data:
---------------
-Directly sensitive
   -Real IP address
   -Payload data (user ID/password, encrypted info (credit card no.), encrypted messages, form data, messages (email, IM),
                  filenames, file contents, URL names)
-Indirectly sensitive (how much additional external info must an attacker have for this field to reveal sensitive information?)

   Source               | Compromised information           |   Additional information that the attacker must have
   ICMP error pkt         OS              
   IP Prefixes            Network topology
   TCP packet             OS
   IP freq analysis       real IP addresses of popular web 
                          servers
   traffic flow analysis  Distinguish clients from servers
   dns queries/responses  Real IP addresses of DNS servers      Knowledge of DNS hierarchy
   traffic analysis       Real IP addresses                     Knowledge of which network the trace was taken from,
                                                                ability to trace packets on the network
   traffic analysis       Whether an individual has been        Real IP address corresponding to an anonymized address,
                          communicating with certain people     real IP addresses corresponding to anonymized addresses
                                                                of others
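
The "IP freq analysis" row can be made concrete with a small sketch. It assumes the attacker has an externally obtained list of popular servers ordered by expected traffic volume, and simply pairs the most frequent anonymized destination addresses with that list. The function names and all sample data are hypothetical.

```python
from collections import Counter


def rank_by_frequency(dst_addrs):
    """Anonymized destination addresses, most frequently seen first."""
    return [addr for addr, _ in Counter(dst_addrs).most_common()]


def guess_mapping(anonymized_dsts, popular_servers):
    """Pair the busiest anonymized addresses with the most popular servers."""
    return dict(zip(rank_by_frequency(anonymized_dsts), popular_servers))


# Hypothetical trace: anonymized destination address of each packet.
trace = ["a"] * 5 + ["b"] * 3 + ["c"]
# Hypothetical outside knowledge, most popular server first.
popular = ["www.example.com", "www.example.org"]
print(guess_mapping(trace, popular))
# {'a': 'www.example.com', 'b': 'www.example.org'}
```

The guess is only as good as the outside knowledge, which is why the third column of the table matters when judging how sensitive a field is.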


Deduction paths:
----------------
OS->individual
App->OS
Port no.->App
Protocol->Apps
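
The deduction paths above chain together: a port number leads to an application, which leads to an OS, which may lead to an individual. A minimal sketch of following such chains, treating the four paths as edges of a directed graph (the dictionary keys and the reachable helper are illustrative names, not part of the notes):

```python
from collections import deque

# Edges copied from the deduction paths listed above: knowing the key
# lets an observer infer each of the listed values.
DEDUCTIONS = {
    "port": ["app"],
    "protocol": ["app"],
    "app": ["os"],
    "os": ["individual"],
}


def reachable(start):
    """Everything that can be inferred from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in DEDUCTIONS.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(reachable("port"))  # ['app', 'os', 'individual']
```

A field that looks harmless on its own (a port number) therefore inherits the sensitivity of everything reachable from it.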

Links:
------
TBIT, the TCP Behavior Inference Tool (Differences in TCP implementations) http://www.icir.org/tbit
TCP header flags (see RFC 3168) http://www.iana.org/assignments/tcp-header-flags
Common Vulnerabilities and Exposures http://www.cve.mitre.org
Google: "header checksum attack"

Ranking:
--------
Rank = Effect of exposure * How easy it is to get the additional info needed

Rank: 100 = Must anonymize
        0 = Anonymization not needed
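
A worked example of the rank formula, as a sketch only. The notes give the 0 to 100 scale but not the scale of the two factors, so this assumes each factor is scored from 0 to 10; the scores passed in below are hypothetical.

```python
def rank(effect_of_exposure, ease_of_getting_info):
    """Rank = effect of exposure * how easy it is to get the additional info.

    Assumption (not from the notes): both factors are scored 0-10, so the
    product lands on the 0-100 scale.
    """
    for score in (effect_of_exposure, ease_of_getting_info):
        if not 0 <= score <= 10:
            raise ValueError("scores must be between 0 and 10")
    return effect_of_exposure * ease_of_getting_info


# Hypothetical scores, for illustration only.
print(rank(10, 10))  # 100: severe exposure, no extra info needed
print(rank(8, 5))    # 40
print(rank(9, 0))    # 0: severe exposure, but the extra info is unobtainable
```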

For each source: the information that can be deduced from it, and the additional info the attacker needs.

-Prefixes of anonymized addresses
   -Information deduced: Network topology
   -Additional info needed: None
-Mix of applications used on network (port nos.)
   -Information deduced: Which network the trace was taken from
   -Additional info needed: Mix may not be unique, but with other contextual clues the network could be identified, e.g. geographical region
-Packet header checksums
   -Information deduced: Narrowed-down range of possible real IP addresses
   -Additional info needed: None
-Packet header checksums
   -Information deduced: Real IP addresses
   -Additional info needed: Known range of real IP addresses
-Appearance of DNS requests / queries in trace
   -Information deduced: Real IP addresses of DNS servers
   -Additional info needed: Knowledge of DNS hierarchy
-Traffic analysis
   -Information deduced: Real IP addresses
   -Additional info needed: Knowledge of normal traffic patterns of the network that the trace was taken from
-OS
   -Information deduced: Real IP address
   -Additional info needed: Map of OS's, IP addresses in the network
-Application used by a user, operating system used by a user
   -Information deduced: Differences in implementation of standardized network protocols
   -Additional info needed: None
-Frequency analysis of anonymized IP addresses
   -Information deduced: Popular web servers
   -Additional info needed: Typical volume handled by popular web servers, popular web sites
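
The two "Packet header checksums" entries can be illustrated with a short sketch. The IPv4 header checksum is the ones' complement of the ones' complement sum of the header's 16-bit words, so if every other field is known, an unmodified checksum constrains which address pairs are possible. The fixed header values, the candidate list and the function names below are hypothetical; the sketch assumes a 20-byte header without options.

```python
import ipaddress
import struct


def header_checksum(header: bytes) -> int:
    """Checksum of a 20-byte IPv4 header whose checksum field is zero."""
    total = 0
    for word in struct.unpack("!10H", header):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF


def build_header(src: str, dst: str) -> bytes:
    """Hypothetical header in which only the two addresses vary."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 40,   # version/IHL, TOS, total length
        0x1234, 0,     # identification, flags/fragment offset
        64, 6, 0,      # TTL, protocol (TCP), checksum placeholder
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )


def consistent_pairs(observed_checksum, candidates):
    """Ordered (src, dst) pairs whose header would produce the checksum."""
    return [
        (s, d)
        for s in candidates
        for d in candidates
        if s != d and header_checksum(build_header(s, d)) == observed_checksum
    ]


candidates = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "192.168.1.1"]
observed = header_checksum(build_header("10.0.0.1", "192.168.1.1"))
print(consistent_pairs(observed, candidates))
# [('10.0.0.1', '192.168.1.1'), ('192.168.1.1', '10.0.0.1')]
```

Of the 12 possible ordered pairs, only the true pair and its reverse survive, since the sum does not depend on word order. With a known range of real addresses, the checksum alone can therefore undo much of the address anonymization, which supports not releasing the original header checksum.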