99% of the time, when someone talks about proxies, they mean normal, everyday proxy connections. There are other types of proxies, though, and it would be a fundamental mistake to think they all work the same way. It’s time for some education.
How a Normal Proxy Works
A standard proxy server is the default from which other types of proxies are derived. It’s very simple in essence, though it has a lot of different possible configuration options. VPNs are a related idea: they also route your traffic through an intermediary, though they do it for all of your traffic at the network level rather than per application.
A proxy is essentially just a server that sits between you and the website you want to reach. Normally when you want to connect to a website, you send a request from your browser. This request goes out through your modem and router, across your ISP’s local and regional network, onto an internet backbone, and then back down through the same kind of hierarchy to the server you’re connecting to.
Web servers often sit in data centers with well-connected links close to an internet backbone, so the last leg on their side tends to add little latency.
A proxy server simply sits somewhere in the middle. Often, the proxy is physically located in another country, so a proxy connection might cross an ocean, go through a satellite, or include some wireless transmission somewhere along the way.
A proxy server doesn’t have to be a dedicated server or a piece of high-powered hardware. In fact, with the right configuration, any computer can be turned into a proxy server. A proxy running on modest hardware adds even more latency, but more on that later.
In our example, your direct connection goes from your computer, through a bunch of routers, and into the web host server where the website resides. It gets the data you want, then returns the way it came, to give you that information. That information can be anything: a page load, a file download, or the traffic of a web app.
With a proxy, the connection leaves your computer and goes through routers until it reaches the proxy. Depending on its configuration, the proxy may strip some identifying data from your request and substitute its own, or it may pass the request along unchanged. It then forwards the request to the web server, through routers and connections once again. The web server answers the request, recording the proxy’s data rather than yours, and the response goes back to the proxy server. The proxy forwards the response back to you.
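As a rough sketch of what this looks like from the client side, Python’s standard library can be told to send requests through a proxy instead of directly. The address 203.0.113.10:8080 is a placeholder from a documentation-only range, not a real proxy, and build_proxy_opener is a hypothetical helper name chosen for this example.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    # ProxyHandler maps each URL scheme to the proxy that should carry it.
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder address (documentation range); substitute a proxy you control.
opener = build_proxy_opener("http://203.0.113.10:8080")

# Uncomment to actually send a request through the proxy:
# with opener.open("http://example.com/", timeout=10) as response:
#     print(response.status)
```

Nothing is sent until opener.open is called, so the snippet is safe to run as written. From the web server’s point of view, the request would then arrive from the proxy’s address rather than yours.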
This by physical necessity adds latency to your connection, but the benefit you get out of it is one of privacy. Because the proxy strips identifying data such as your IP address and certain request headers, websites don’t see where your connection is really coming from. There are different levels of anonymity:
- Anonymous: removes as much identifying data as possible so that the web server receives little or no information about who you are.
- Distorting: removes some data and changes other data, so that the web server gets some information, but it may not be correct.
- Transparent: does not remove or hide your information. These proxies are often used to forward or alter content rather than to hide connection information.
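The three levels can be sketched as a function that decides which headers get forwarded. This is a simplified model, not how any particular proxy is implemented: it assumes the proxy signals itself through the common X-Forwarded-For and Via headers, and the function name, proxy label, and IP addresses are all made up for illustration.

```python
def forward_headers(client_headers, client_ip, level):
    """Return the headers a proxy at the given anonymity level might forward."""
    headers = dict(client_headers)
    if level == "transparent":
        # Passes everything along and openly reports the original client IP.
        headers["X-Forwarded-For"] = client_ip
        headers["Via"] = "1.1 example-proxy"
    elif level == "distorting":
        # Admits to being a proxy but reports a made-up client address.
        headers["X-Forwarded-For"] = "198.51.100.77"
        headers["Via"] = "1.1 example-proxy"
    elif level == "anonymous":
        # Drops headers that could identify the client or reveal the proxy.
        for name in ("X-Forwarded-For", "Via", "Referer"):
            headers.pop(name, None)
    else:
        raise ValueError(f"unknown anonymity level: {level}")
    return headers


request = {"User-Agent": "ExampleBrowser/1.0", "Referer": "http://example.org/"}
for level in ("transparent", "distorting", "anonymous"):
    print(level, forward_headers(request, "192.0.2.15", level))
```

In every case the web server sees the proxy’s IP address as the source of the connection; the levels differ only in what the forwarded headers give away on top of that.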
There is also Tor (The Onion Router), an anonymity network that passes your connection through a randomly chosen series of relays before it reaches its destination. Tor has privacy issues of its own, but it’s better than a transparent proxy. Tor is also similar to a backconnect proxy system in many ways.
Standard proxy servers can be configured in a number of ways. They can be public, open to anyone to use, or private, restricted to people who pay for access. They can pass data through unaltered, or they can add in ads or display your content in an iframe. They can be fast or slow. They can be located in the USA, Britain, Germany, India, Ukraine, or anywhere else in the world. They can connect using HTTP or one of the SOCKS protocols.
Proxies, in general, are useful for a certain minimal level of security and anonymity, but one proxy on its own is going to do very little to maintain your privacy if you’re doing something anyone seriously wants to monitor. It might protect you from neighbor Bob’s 15-year-old script kiddie nephew, but it won’t protect you from the NSA or the FBI.
For stronger protection, you need something like Tor or a backconnect proxy. These are somewhat more secure, but they’re better suited to something like bulk data scraping than to casual everyday browsing.
In fact, the number one drawback of most proxies is how much slower they tend to make your connection. It doesn’t matter what proxy you’re using, what protocol it’s using, how powerful the machine is, or where it’s located; a proxy is an additional step. If you add six rungs to a ladder, that ladder is taller and takes more time and effort to climb, no matter how fast you are at climbing, where you position it, or what technique you use.
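To put rough numbers on the extra step, here is a back-of-the-envelope calculation. The latency figures are invented purely for illustration, and round_trip_ms is a hypothetical helper; real values depend entirely on the route.

```python
def round_trip_ms(legs_ms, processing_ms=0.0):
    """Round-trip time: each leg is crossed twice, plus time spent in the proxy."""
    return 2 * sum(legs_ms) + processing_ms


# Made-up one-way latencies in milliseconds.
direct = round_trip_ms([40.0])                # you -> website
via_proxy = round_trip_ms([35.0, 30.0], 5.0)  # you -> proxy -> website

print(direct, via_proxy)  # 80.0 135.0
```

Even with a fast proxy on a good route, the second leg and the proxy’s own processing time are added on top of whatever the direct path would have cost.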
How a Backconnect Proxy Works
As I’ve mentioned a few times already, a backconnect proxy is similar to Tor. It’s not Tor, though; Tor is its own strange beast.
A backconnect proxy essentially works just like a normal proxy. You have all of the same configurations and all of the same benefits and drawbacks, for the most part. The primary difference is in the proxy server itself, or, rather, the network.
A backconnect proxy is not a single proxy server machine, forwarding data and serving it up with different referrers or headers. Instead, it’s a swarm of different machines and different configurations. You might have a pool of 20, or 200, or 50,000, and it can scale up as much as the owner of the backconnect system wants to add to the swarm. Well, that, and how much you’re willing to pay. You’ll understand why momentarily.
The purpose of a backconnect proxy is to change your identifying headers and IP address, much like with a normal proxy server. The difference is, with a single proxy server, you’re changing from A to B. Every connection you make through the proxy comes out as B. Even if you’re using a small rotating list of proxies, it might be B for 10 minutes, then C for 10 minutes, then D for 10 minutes, then back to B. There are footprints, patterns, and cycles people can monitor to see what’s going on.
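The repeating footprint of a small rotating list is easy to see in a quick sketch. The letters stand in for proxy addresses, and the 10-minute timing is left out; only the order of the rotation is modeled.

```python
from itertools import cycle

# A small rotating list: each entry is used in turn, then the list repeats.
proxies = ["B", "C", "D"]
rotation = cycle(proxies)

# What an observer would see over seven consecutive rotation slots.
observed = [next(rotation) for _ in range(7)]
print(observed)  # ['B', 'C', 'D', 'B', 'C', 'D', 'B']
```

With only three addresses, the same sequence comes around every three slots, which is exactly the kind of cycle a site can watch for.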
With a backconnect proxy swarm, every time you send out a request, the proxy connection changes. If you type in www.google.com, connection A fetches Google’s homepage. You type in a search query and connection B sends it out and brings back the results. You click on a search result and connection C fetches the page data, and so on.
The point is, the connection changes with every action you take. Some backconnect proxy providers talk about having a database of millions of available proxies and tens of thousands of them available on any given day. They allow you to pretend to be in specific geographical regions, filter out the connections you don’t want to use, and support various protocols.
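A toy model of that behavior might look like the following. The selection logic real providers use isn’t described here, so the random choice is an assumption, and the class name, method name, and IP addresses (from a documentation-only range) are invented for the example.

```python
import random


class BackconnectGateway:
    """Toy model: each request leaves through a different exit than the last."""

    def __init__(self, exit_ips, seed=None):
        self.exit_ips = list(exit_ips)
        if len(self.exit_ips) < 2:
            raise ValueError("need at least two exits to rotate between")
        self._rng = random.Random(seed)
        self._last = None

    def pick_exit(self):
        """Choose the exit for the next request, never repeating the previous one."""
        candidates = [ip for ip in self.exit_ips if ip != self._last]
        self._last = self._rng.choice(candidates)
        return self._last


# Hypothetical pool of 50 exit addresses.
pool = [f"198.51.100.{n}" for n in range(1, 51)]
gateway = BackconnectGateway(pool, seed=1)

# Five consecutive actions, each leaving through a different exit than the one before.
exits = [gateway.pick_exit() for _ in range(5)]
print(exits)
```

Compared with a short rotating list, a large pool with no fixed order leaves far less of a repeating pattern to spot.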
Advantages of a Backconnect Proxy
The primary advantage of a backconnect proxy swarm is that you can make more requests per minute than you could through any one connection. Sites like Google, Amazon, or any site you want to scrape data from, typically have rate limits based on IP address. If Google sees too many connections coming from IP address A, it will put a captcha on the actions involved. Even if you pass it, too many connections will get you rate limited for minutes at a time.
With a backconnect proxy, every connection you send out goes through a different IP address and port, and reaches Google from a different source. Google won’t be able to group them together to impose a rate limit; if it adds a rate limit to one IP, it doesn’t matter. The next connection you request comes from a different IP anyway.
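A small simulation shows why spreading requests across a pool sidesteps a per-IP limit. The limiter below is a deliberately simple stand-in for whatever a real site does (it counts requests per IP within a single time window), and the limit of 10, the class and function names, and the addresses are all assumptions made for the example.

```python
from collections import Counter


class PerIpRateLimiter:
    """Allows at most max_requests per IP within one time window."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.counts = Counter()

    def allow(self, ip):
        self.counts[ip] += 1
        return self.counts[ip] <= self.max_requests


def count_allowed(limiter, ips):
    """Send one request per entry in ips and count how many get through."""
    return sum(limiter.allow(ip) for ip in ips)


# 100 requests from one IP against a limit of 10 per window.
single = count_allowed(PerIpRateLimiter(10), ["203.0.113.5"] * 100)

# The same 100 requests spread evenly across a pool of 50 exit IPs.
pool = [f"198.51.100.{n}" for n in range(1, 51)]
spread = count_allowed(PerIpRateLimiter(10), [pool[i % len(pool)] for i in range(100)])

print(single, spread)  # 10 100
```

From a single address, 90 of the 100 requests are rejected. Spread across the pool, each address makes only two requests, so none of them comes close to the limit.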
Drawbacks of a Backconnect Proxy
On the other hand, because a backconnect proxy connection is potentially hundreds or thousands of different connections, it has two major drawbacks.
The first is the expense; these lists tend to be very expensive to gain access to.
The second is wildly varying quality. Some connections will be fast, some will be slow, and some might be banned by the service you’re trying to access. You have very little ability to control them.
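One common way to cope with uneven quality is simply to retry, since each new attempt should leave through a different connection. The sketch below assumes you supply your own fetch function that sends a request through the backconnect proxy; fetch_with_retries and flaky_fetch are hypothetical names, and flaky_fetch only simulates two bad connections followed by a good one.

```python
def fetch_with_retries(fetch, url, attempts=3):
    """Call fetch(url) up to `attempts` times and return the first success."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch(url)
        except OSError as error:  # includes timeouts and connection failures
            last_error = error
    raise RuntimeError(f"all {attempts} attempts failed") from last_error


# Stand-in for a real fetch: fails twice, then succeeds.
outcomes = iter([TimeoutError("slow exit"), ConnectionResetError("banned exit"), "page data"])


def flaky_fetch(url):
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = fetch_with_retries(flaky_fetch, "http://example.com/")
print(result)  # page data
```

Retrying doesn’t make the bad connections any better; it only hides them, at the cost of extra time and extra requests that you may be paying for.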
Tor works in a similar way to a backconnect proxy, except that Tor bounces your connection between several of its own relays rather than through a single proxy that changes. With a backconnect proxy, you always have just one proxy between you and the destination. With Tor, your connection passes through multiple relays, typically three, inside the Tor network.