BSP isn't really complex. The only real problem is that when you compile a BSP from your world data, you have to cut any polys that cross partition planes. As poly density increases, that means more and more cutting, which can cause problems. This is why it's useful to optimize the BSP compiler, and that can get a bit complicated. But in its most basic implementation, BSP is really easy to write. I'd say it's actually easier than octrees.
Besides, with octrees you have similar problems. The method only works if each poly belongs to only one node, so any polys that cross cell boundaries in the octree also have to be sliced. Eventually, you still have to optimize the compiler, and you lose some of the optimizations available to BSP.
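The poly-cutting step both compilers share can be sketched pretty compactly. Here's a minimal version in Python, assuming convex polygons and a plane given as `n · p = d` (the function name and representation are just illustrative, not from any particular engine):

```python
# Clip one convex polygon against a partition plane, producing the
# front-side and back-side pieces. This is the slicing a BSP (or octree)
# compiler has to do for every poly that straddles a partition.

def split_polygon(poly, normal, d, eps=1e-9):
    """Split a convex polygon (list of (x, y, z) tuples) by the plane
    normal . p = d. Returns (front_piece, back_piece); either may be empty."""
    def dist(p):
        return sum(n * c for n, c in zip(normal, p)) - d

    front, back = [], []
    for i, a in enumerate(poly):
        b = poly[(i + 1) % len(poly)]
        da, db = dist(a), dist(b)
        if da >= -eps:
            front.append(a)
        if da <= eps:
            back.append(a)
        # Edge crosses the plane: emit the intersection point to both sides.
        if (da > eps and db < -eps) or (da < -eps and db > eps):
            t = da / (da - db)
            cut = tuple(pa + t * (pb - pa) for pa, pb in zip(a, b))
            front.append(cut)
            back.append(cut)
    return front, back
```

Splitting a 2x2 quad by the plane x = 1, for example, yields two 1x2 quads. The slicing is cheap per poly; the problem the post describes is that every split produces two polys where there was one, so dense geometry snowballs.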
That aside, yes, both BSP and octrees can be used for frustum culling and order-dependent transparency. You can also do some collision test optimizations with both.
Using a bounding box for occlusion tests only makes sense if your objects tend to fill out their bounding boxes, and even then it can lead to artifacts. Similarly, LOD models can lead to artifacts, and you have to spend a pass on the detection. What you basically do is render the LOD scene to an index buffer, which you then use as a mask. It doesn't reduce the amount of geometry you have to render, but it reduces the number of pixel operations, and those are the most expensive part of modern 3D engines.
For example, say you are rendering a city with buildings you can actually enter. The interiors might be pretty detailed, so you want to reduce the load of rendering exteriors. You have a huge chunk of the city in your frustum, but you might only see a tiny bit of it through the windows. So render the bare walls, low-poly, of the building you are in to the index buffer. You can clear the index to zero first, for example, then render the rooms as 1s. Then use this as a mask while rendering the city outside. You'd still do frustum culling on the geometry, but even within the frustum, only the pixels that correspond to zero in your index buffer make it to the pixel shader stage and require texturing and lighting. Huge savings. Finally, you render the interior of the building without paying attention to the mask. Rendering the interior after the exterior helps you with any transparent objects (e.g. windows). One more thing you can do with this technique is render the exterior and interior using different depth buffer settings. Source does this for its 3D skyboxes. It lets you make better use of depth buffer precision for distant objects. (P.S. Note that "interior" can be very loosely defined. You might have situations where a narrow alley, or even a street, is useful to count as interior. Rendering a low-poly version of nearby buildings to the index buffer might significantly reduce the amount of rendering you have to do on the rest of the city. But you have to be able to tell interior from exterior really easily here. Octrees can help, though a more general convex-geometry approach might be better.)
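A toy software version of that masking idea, with the "index buffer" as a plain 2D array (resolution, the window position, everything here is made up for illustration):

```python
# "Render" the low-poly room walls as 1s into a mask, then count how many
# exterior pixels would still need the expensive pixel-shader work.

W, H = 8, 4
mask = [[0] * W for _ in range(H)]   # index buffer, cleared to zero

# The room we are standing in covers the whole view except a window.
window = {(1, 3), (1, 4)}            # (row, col) pixels where we see outside
for r in range(H):
    for c in range(W):
        if (r, c) not in window:
            mask[r][c] = 1           # interior walls rendered as 1s

# Exterior pass: only pixels where the mask is still 0 get shaded.
shaded = sum(1 for r in range(H) for c in range(W) if mask[r][c] == 0)
print(shaded)  # only the 2 window pixels survive the mask
```

Out of 32 pixels, only the 2 window pixels would reach texturing and lighting for the whole exterior city pass, which is the "huge savings" part.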
Another example of checking for occlusions with index maps is shadow volumes, a technique popularized by Carmack of id Software fame. You start by building the shadow volumes: take the light-facing side of the object that casts the shadow (often a low-poly version) and extrude its perimeter away from the light to form the volume that contains the shadow. Then you render in several stages. First, you render the entire scene with ambient light only. Then you do the shadow volume pass: you render the shadow volumes to the index buffer twice, first the front faces with +1, then the back faces with -1. Anything inside a shadow volume ends up with a positive count; anything outside all shadows ends up at 0. You can use that as a mask and render a lit version of the scene. Now things in shadow stay dark, and things outside shadow are lit.
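The +1/-1 counting can be sketched per pixel. This is a sketch of the depth-pass variant under simplifying assumptions (one ray through the scene, made-up depths; the function name is mine, not standard API):

```python
# Tally shadow-volume faces in front of a pixel, the way the index-buffer
# pass does: front faces +1, back faces -1, counting only faces that are
# closer than the pixel (i.e. that pass the depth test).

def in_shadow(pixel_depth, volume_faces):
    """volume_faces: list of (depth, kind) with kind 'front' or 'back'."""
    count = 0
    for depth, kind in volume_faces:
        if depth < pixel_depth:
            count += 1 if kind == 'front' else -1
    return count > 0   # positive count => pixel is inside a shadow volume

# One shadow volume spanning depths 2..5 along this ray:
faces = [(2.0, 'front'), (5.0, 'back')]
print(in_shadow(3.0, faces))   # inside the volume: only the front face passes -> True
print(in_shadow(7.0, faces))   # behind the volume: both faces cancel -> False
```

A pixel behind the whole volume sees both faces, so the +1 and -1 cancel to 0; a pixel inside sees only the front face and stays positive; a pixel in front of the volume sees neither.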
Naturally, shadow volumes are a form of occlusion testing. You just want occlusion from the light's perspective rather than the camera's.
As for the GTA games, I'm not sure exactly what optimizations IV and V do, but they seem to belong to the family of more modern, dynamic, object-oriented engines. The general approach with these is to render in two stages. The first stage passes everything you are going to render to an optimization routine that does depth-sorting, frustum culling, and so on. All of this happens purely on the CPU, usually using only bounding box information or similar. The optimizer then actually renders things in the correct order, culling as necessary.
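A rough sketch of that CPU-side pre-pass, with visibility grossly simplified to a 1D near/far range against bounding spheres (all fields and numbers here are invented for illustration):

```python
# Cull objects whose bounds fall outside the view range, then depth-sort
# the survivors before issuing draw calls.

from dataclasses import dataclass

@dataclass
class Object3D:
    name: str
    center_z: float   # distance from camera along the view axis
    radius: float     # bounding sphere radius

def build_draw_list(objects, near, far):
    visible = [o for o in objects
               if o.center_z + o.radius > near and o.center_z - o.radius < far]
    # Sort front-to-back here (good for opaque geometry and early depth
    # culling); transparent geometry would get a back-to-front pass instead.
    return sorted(visible, key=lambda o: o.center_z)

scene = [Object3D("tree", 40, 2), Object3D("car", 10, 3),
         Object3D("mountain", 900, 100),    # beyond the far plane: culled
         Object3D("bird_behind", -5, 1)]    # behind the camera: culled
for obj in build_draw_list(scene, near=0.1, far=500):
    print(obj.name)   # car, then tree
```

A real engine would use a proper frustum and spatial structure instead of one axis, but the shape of the pass is the same: cull, sort, then render in order.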
Older GTA games, like San Andreas, didn't bother with any of that, which is why there are all sorts of transparency bugs in the game. What happens is that an object with an alpha channel in its texture gets rendered first. Only the "visible" pixels make it to the color buffer, but all of the geometry, including the transparent parts, gets rendered to the depth buffer. (Alpha blending only affects color, naturally.) As a result, any objects behind the transparent object get depth-culled, and you end up with things like looking through a bush or some railing, which is just a low-poly object with a transparent texture, and seeing right through the wall behind it. It's a bug caused by a total lack of depth-sorting.
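You can reproduce that bug in a few lines of software rasterization for a single pixel (depths and names are made up; the point is only the order of operations):

```python
# A transparent texel still writes depth, so the wall behind it gets
# depth-culled and the clear color shows through.

depth = float('inf')
color = 'sky'   # clear color

def draw(fragment_color, fragment_depth, alpha):
    global depth, color
    if fragment_depth < depth:
        depth = fragment_depth    # depth is written regardless of alpha
        if alpha > 0:             # but only visible texels write color
            color = fragment_color

# Wrong order: the bush (fully transparent texel) first, the wall second.
draw('bush', 5.0, alpha=0)   # transparent texel: writes depth only
draw('wall', 9.0, alpha=1)   # fails the depth test -> never drawn
print(color)  # 'sky' -- you see through the wall
```

Drawing the wall first (back-to-front, i.e. with depth-sorting) would leave `color` as `'wall'`, which is why sorting fixes it.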
A final note on occlusion testing. Most of the time, the depth buffer is all you need. There are some exceptions. As I've mentioned above, if you are rendering an exterior scene and an interior one at the same time, it makes sense to render the exterior first, and then you want to do some occlusion optimization. Similarly, if you are doing order-dependent transparency, you need to start with the most distant objects, so occlusion optimization can help there too. But otherwise, a nearby object that has already been rendered to the depth buffer prevents any distant pixels from being rendered in the same place. You still have to do the geometry transforms (the vertex shader pass), but that's cheap. What you don't want is for all those distant pixels to go through the pixel shader, and depth-culling happens before that.
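The "depth buffer is all you need" point is easy to show with a toy counter of pixel-shader invocations. A sketch under the assumption of early depth testing, with several opaque fragments landing on the same pixel:

```python
# With early depth testing, front-to-back order means occluded fragments
# never reach the pixel shader; back-to-front shades everything and
# overdraws. Depth values are invented.

def render(draw_order):
    """Each entry covers the same single pixel at the given depth.
    Returns how many fragments would invoke the pixel shader."""
    depth = float('inf')
    shaded = 0
    for d in draw_order:
        if d < depth:       # early depth test, before shading
            depth = d
            shaded += 1     # only surviving fragments get shaded
    return shaded

near_to_far = [1.0, 2.0, 3.0, 4.0]
far_to_near = [4.0, 3.0, 2.0, 1.0]
print(render(near_to_far))  # 1 -- only the nearest fragment is shaded
print(render(far_to_near))  # 4 -- every fragment is shaded, then overdrawn
```

Either way the final image is the same; the difference is purely how much pixel-shader work gets thrown away, which is the cost the post says you want to avoid.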