cover

Here you can get a free skin cancer screening in Hamburg even if you are younger than 35 years.

Background

This paragraph is mostly a rant.

In Germany you are obligated to a free skin cancer screening if you are 35 years or older, and you are under the statutory health insurance. If you reach that age, you can even pick your doctor. Since this can be too late, many insurances – included TK, which I am using – offers this service for younger people. However you have to pick a doctor that has a special agreement with said insurance.

So how do you find a doctor? A search on Doctolib results in over 100 results, and as we will later see, the coverage varies greatly depending on the insurance. Calling every one of them is certainly a waste of time. So I am using the TK Android app to contact their support and ask for a list or a recommendation. That turned out to be harder than expected:

  • auto-complete: The app disables auto-complete of your keyboard, so it takes a while to write that request. Their support told me that this is for “data protection” reasons, which is clearly nonsense (see flagNoPersonalizedLearning under android:imeOptions).
  • messages: If you receive a response – or any kind of message from TK – you get an email telling you that there is a message and that you should use their app to read it. If you open the app – which by the way does not support push notifications – then you find a message stating that you have a message, plus a PDF attachment. That PDF then finally contains the actual content.
  • no “advertisement”: They then told me that they cannot name a doctor because that would be “advertisement” and as an insurance, they are not allowed to do that.

At this point I am wondering if I am communicating with a Schildbürger.

After some internet search, I found the Kassenärztliche Vereinigung Hamburg and they offer lists of doctors that signed agreements. These lists however are in PDF form and it is rather cumbersome to find a doctor close to you that also has somewhat good reviews. So I did some magic to create some beautiful maps for me (and you).

Maps

The coverage depends on your health insurance. Also check your insurance for the exact terms and conditions. This data here might also be out of date and is provided without any guarantees whatsoever.

Barmer

BIG direkt

BKK Landesverband Nordwest

HEK

Knappschaft

TK

Technical Details

The entire processing was done in Python in a marimo notebook:

  1. Get PDFs from Kassenärztliche Vereinigung Hamburg using requests
  2. Parse tables in PDFs using Camelot
  3. Massage the inconsistent table into a consistent format using Pandas
  4. Retrieve geo location and user rating from Google Maps using the googlemaps library
  5. Store data in GeoJSON using GeoPandas
  6. Render map on this website
Code
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "camelot-py[ghostscript,plot]>=1.0.0",
#     "geopandas==1.1.0",
#     "googlemaps==4.10.0",
#     "marimo",
#     "pandas==2.3.0",
#     "python-dotenv==1.1.0",
#     "requests==2.32.4",
#     "tqdm==4.67.1",
# ]
# ///

import marimo

__generated_with = "0.14.0"
app = marimo.App()


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""# Find a Skin Doctor""")
    return


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""## Setup""")
    return


@app.cell
def _():
    import io
    import pathlib

    import geopandas as gpd
    import marimo as mo
    import pandas as pd

    import camelot
    import dotenv
    import googlemaps
    import requests
    from tqdm.notebook import tqdm
    return camelot, dotenv, googlemaps, gpd, io, mo, pd, requests, tqdm


@app.cell
def _(dotenv, mo):
    cfg = dotenv.dotenv_values(mo.notebook_dir() / ".env")
    return (cfg,)


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""## Load Data""")
    return


@app.cell
def _():
    urls = {
        "Barmer": "https://www.kvhh.net/_Resources/Persistent/4/b/3/b/4b3b919d83692af7e2323a3385bef7c76b9ead25/3-2024%20Teilnehmerverzeichnis%20HKS-Barmer.pdf",
        "BIG direkt": "https://www.kvhh.net/_Resources/Persistent/4/5/7/1/4571da229a8b0193b5443342fa9f99e2ac6be153/II-2024%20Teilnehmerverzeichnis%20TNHKS4.pdf",
        "BKK Landesverband Nordwest": "https://www.kvhh.net/_Resources/Persistent/c/e/9/d/ce9d9f821adf03b163952c284c29c87250b92d1a/TNHKS5%20Hautkrebsvorsorge%20Verfahren.pdf",
        "Knappschaft": "https://www.kvhh.net/_Resources/Persistent/7/d/c/f/7dcf3223c62ff85fa2b049fbc3d59e1f09d9382b/TNHKS6.pdf",
        "HEK": "https://www.kvhh.net/_Resources/Persistent/4/2/9/0/4290bbcab9b99fa1b1bf3e776efafc533634f431/TNHKS1.pdf",
        "TK": "https://www.kvhh.net/_Resources/Persistent/4/2/9/0/4290bbcab9b99fa1b1bf3e776efafc533634f431/TNHKS1.pdf",
    }
    return (urls,)


@app.cell
def _(requests, tqdm, urls):
    data = {}
    for _name, url in tqdm(urls.items()):
        resp = requests.get(url)
        resp.raise_for_status()
        data[_name] = resp.content
    return (data,)


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""## Load List""")
    return


@app.cell
def _(camelot, data, io, pd, tqdm):
    dfs = []
    for _name, pdf in tqdm(data.items()):
        tables = camelot.read_pdf(io.BytesIO(pdf), pages="all")
        df = pd.concat((table.df for table in tables), ignore_index=True)
        df = (
            df.rename(
                columns=df.iloc[0],
            )
            .drop(df.index[0])
            .reset_index(
                drop=True,
            )
            .rename(
                columns={
                    "Adresse": "Address",
                    "Adresse HE": "Address",
                    "Telefon inkl. Vorwahl": "Phone",
                    "Telefon, incl. Vorwahl": "Phone",
                    "Telefon,": "Phone",
                    "Telefonnummer": "Phone",
                    "Rufname": "Vorname",
                    "Strasse Hausnr.": "StrasseHausnummer",
                }
            )
        )
        if "Address" not in df.columns:
            df["Address"] = (
                df["Titel"]
                + " "
                + df["Vorname"]
                + " "
                + df["Nachname"]
                + ", "
                + df["StrasseHausnummer"]
                + ", "
                + df["PLZ"]
                + " "
                + df["Ort"]
            )
        df = df.loc[:, ["Address", "Phone"]]
        df["Insurance"] = _name
        dfs.append(df)
    df = pd.concat(dfs, ignore_index=True)
    return (df,)


@app.cell
def _(df):
    df
    return


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""## Geocoding""")
    return


@app.cell
def _(cfg, googlemaps):
    gmaps = googlemaps.Client(key=cfg["GOOGLE_API_KEY"])
    return (gmaps,)


@app.cell
def _(df, gmaps, pd, tqdm):
    key = []
    lat = []
    lng = []
    rating = []
    user_ratings_total = []

    for addr in tqdm(sorted(df["Address"].unique())):
        key.append(addr)

        results = gmaps.places(addr)["results"]
        if len(results) == 0:
            results = gmaps.geocode(addr)
        result = results[0]
        location = result["geometry"]["location"]
        lat.append(location["lat"])
        lng.append(location["lng"])

        place = gmaps.place(result["place_id"])["result"]
        rating.append(place.get("rating"))
        user_ratings_total.append(place.get("user_ratings_total"))

    df_lookup = pd.DataFrame(
        {
            "Address": key,
            "lat": lat,
            "lng": lng,
            "Rating": rating,
            "Number of User Ratings": user_ratings_total,
        }
    )
    return (df_lookup,)


@app.cell
def _(df, df_lookup):
    df_1 = (
        df.merge(df_lookup, on="Address")
        .sort_values(
            by=["Insurance", "Rating", "Number of User Ratings"], ascending=True
        )
        .reset_index(drop=True)
    )
    return (df_1,)


@app.cell(hide_code=True)
def _(mo):
    mo.md(r"""## Plot""")
    return


@app.cell
def _(df_1, gpd, mo, tqdm):
    for kasse in tqdm(sorted(df_1["Insurance"].unique())):
        df_sub = (
            df_1.loc[df_1["Insurance"] == kasse,]
            .reset_index(drop=True)
            .drop(columns=["Insurance"])
        )
        gdf = gpd.GeoDataFrame(
            df_sub,
            geometry=gpd.points_from_xy(df_sub.lng, df_sub.lat),
            crs="EPSG:4326",
        )
        with (mo.notebook_dir() / f"{kasse}.json").open("w") as fp:
            fp.write(
                gdf.to_json(
                    show_bbox=True,
                    to_wgs84=True,
                )
            )
        """
        m = gdf.explore(
            cmap="plasma",
            column="rating",
            marker_type="circle_marker",
            marker_kwds=dict(radius=5),
            popup=True,
            style_kwds=dict(fillOpacity=1, stroke=False),
        )
        m.save(mo.notebook_dir() / f"{kasse}.html")
        """
    return


@app.cell
def _(df_1):
    df_1
    return


if __name__ == "__main__":
    app.run()