OpenStreetMap integration
[edit]OpenStreetMap offers significantly more data that could enhance our trail guides, though it’s unclear what licensing or usage agreements would apply in these cases. Examples of useful data include:
- Distance (km) for each stage
- Elevation change and elevation profiles
- Estimated hiking time
- Complete trail mappings, including sub-relations
It’s worth noting that for most long-distance trails covered on Wikivoyage, the corresponding OpenStreetMap relations are already segmented in a way that aligns well with our guide format. The main exceptions are the Appalachian Trail and the GR 10, which are not divided into logical sections (for our purposes). Jpolvto (talk) 16:58, 4 May 2025 (UTC)
- The OSM licences are designed to make reuse possible, but I haven't checked their licensing terms in detail. Much of what we use are data collections rather than literature (maps are counted as such, at least in Finland), and data, I think, isn't protected in the USA. In EU there is the neighbouring database right protecting it, so we should respect the licensing terms (otherwise our EU contributors may have difficulties editing that aspect of our articles). Some uses require attribution; note the "OSM contributors" attribution included by default at the bottom of our dynamic maps. –LPfi (talk) 09:56, 5 May 2025 (UTC)
Scripts
[edit]For fetching data:
""" Download + cache the elevation profile for an OSM hiking super-route. The pickled result looks like: { "df": pandas.DataFrame, # elevation profile (dist_km / elev_m) "breaks": list[float], # stage-boundary distances, *including 0.0* "labels": list[str], # place-names, same length as breaks } USAGE python fetch_profile.py # default Lapplandsleden python fetch_profile.py -r 123456 -o mytrail.pkl """ import argparse import itertools import math import pickle import random import time from pathlib import Path import overpy import pandas as pd import requests from pyproj import Geod from shapely.geometry import LineString from shapely.ops import linemerge DENSIFY_SPACING_M = 50 DEM_ENDPOINT = "https://api.opentopodata.org/v1/mapzen" # global 90 m DEM_BATCH_LIMIT = 50 DELAY_BETWEEN_CALLS = 1.0 MAX_DEM_RETRIES = 4 BASE_BACKOFF_S = 2.5 DEFAULT_RELATION = 13439996 # Lapplandsleden super-route DEFAULT_OUTFILE = Path("lapplandsleden_profile.pkl") api = overpy.Overpass(max_retry_count=3, retry_timeout=30) geod = Geod(ellps="WGS84") def chunks(it, size): """Yield successive *size*-element chunks from *it*.""" it = iter(it) while (batch := list(itertools.islice(it, size))): yield batch def fetch_elevations(points): """Call the DEM API once for the provided shapely Points.""" locs = "|".join(f"{p.y},{p.x}" for p in points) url = f"{DEM_ENDPOINT}?locations={locs}" for attempt in range(MAX_DEM_RETRIES): r = requests.get(url, timeout=20) if r.ok and r.headers.get("content-type", "").startswith("application/json"): return [res.get("elevation") for res in r.json()["results"]] if r.status_code in (429, 503): # throttled / service busy back = BASE_BACKOFF_S * (1.7 ** attempt) + random.uniform(0, 0.5) print(f"[elev] {r.status_code} - sleeping {back:.1f}s") time.sleep(back) continue raise RuntimeError( f"DEM call failed {r.status_code}: {r.text[:120]!r}") raise RuntimeError("DEM retries exhausted") def stage_relations(super_id: int): """Return the stage relations that make up an OSM super-route, in their defined order.""" # 1. Fetch the super-relation itself to get its ordered list of members query_super_rel = f""" [out:json][timeout:120]; relation({super_id}); out body; // 'body' includes the relation's members in their defined order """ result_super_rel = api.query(query_super_rel) if not result_super_rel.relations: raise RuntimeError(f"Super-relation {super_id} not found.") super_relation = result_super_rel.relations[0] stage_rel_ids_ordered = [] for member in super_relation.members: # Use isinstance to check the type of the member object. # This assumes 'RelationRelation' is exposed by 'import overpy' as 'overpy.RelationRelation' # which is standard in recent versions. if isinstance(member, overpy.RelationRelation): # 'ref' attribute gives the ID of the member stage_rel_ids_ordered.append(member.ref) if not stage_rel_ids_ordered: raise RuntimeError( f"No stage relations found as members of the super-relation {super_id}.") # 2. Fetch the details (IDs and tags) for these stage relations # (The rest of the function remains the same as the previous corrected version) id_query_part = "".join( [f"relation({rel_id});" for rel_id in stage_rel_ids_ordered]) query_stages = f""" [out:json][timeout:120]; ( {id_query_part} ); out ids tags; """ result_stages = api.query(query_stages) rels_by_id = {rel.id: rel for rel in result_stages.relations} ordered_rels = [] for rel_id in stage_rel_ids_ordered: if rel_id in rels_by_id: ordered_rels.append(rels_by_id[rel_id]) else: print( f"[warning] Stage relation {rel_id} (member of {super_id}) was not found or couldn't be fetched.") if not ordered_rels: raise RuntimeError( f"Although {len(stage_rel_ids_ordered)} stage relation(s) were listed in super-relation {super_id}, none could be fetched.") return ordered_rels def relation_line(rel_id: int): """Fetch full geometry of a relation as a merged LineString.""" q = f""" [out:json][timeout:120]; relation({rel_id}); way(r)->.w; (.w;>;); out geom; """ ways = api.query(q).ways return linemerge([LineString([(n.lon, n.lat) for n in w.nodes]) for w in ways]) def densify(line, spacing=DENSIFY_SPACING_M): """Return evenly-spaced points along *line* (shapely LineString).""" length = geod.geometry_length(line) steps = max(1, math.ceil(length / spacing)) return [line.interpolate(t / steps, normalized=True) for t in range(steps + 1)] def build_profile(super_id: int): """Build the distance/elevation profile for a super-route relation.""" rows = [] breaks = [0.0] # start at 0 km labels = [] # will capture the first 'from' label here total_m = 0.0 for rel in stage_relations(super_id): sid = rel.id sfrom = rel.tags.get("from", "?") sto = rel.tags.get("to", "?") if not labels: # first loop = trail start labels.append(sfrom) # aligns with 0 km in breaks print(f"[stage] {sid}: {sfrom} → {sto}") pts = densify(relation_line(sid)) elev = [] for batch in chunks(pts, DEM_BATCH_LIMIT): elev.extend(fetch_elevations(batch)) time.sleep(DELAY_BETWEEN_CALLS) seg_m = 0.0 for i, (p, z) in enumerate(zip(pts, elev)): if i: seg_m += geod.inv(pts[i-1].x, pts[i-1].y, p.x, p.y)[2] rows.append({ "dist_km": (total_m + seg_m) / 1_000, "elev_m": z, }) total_m += geod.geometry_length(relation_line(sid)) breaks.append(total_m / 1_000) # arrival at stage finish labels.append(sto) df = (pd.DataFrame(rows) .infer_objects(copy=False) .apply(pd.to_numeric, errors="coerce")) df["elev_m"] = (df["elev_m"] .interpolate(limit_direction="both") .bfill().ffill() .round() .astype("Int64")) return { "df": df, "breaks": breaks, "labels": labels, } def main(): cli = argparse.ArgumentParser( description="Fetch & cache trail elevation profile.") cli.add_argument("-r", "--relation", type=int, default=DEFAULT_RELATION, help="OSM relation ID of the super-route") cli.add_argument("-o", "--outfile", type=Path, default=DEFAULT_OUTFILE, help="Where to store the pickled profile") args = cli.parse_args() bundle = build_profile(args.relation) with open(args.outfile, "wb") as f: pickle.dump(bundle, f) print(f"[saved] {args.outfile} ({bundle['df'].shape[0]:,} points)") if __name__ == "__main__": main()
For creating wikitext/matplotlib elevation profile:
""" Emit Wikivoyage-ready stage descriptions **and** save a Matplotlib elevation profile PNG (with vertical annotations). Design goals ============ * **Single frame:** Only one X-axis and one Y-axis, intersecting at the origin (0 km, 0 m). Top and right spines remain hidden. * **Numeric kilometre ticks preserved** on the bottom axis. * **Stage labels shown on a secondary (invisible) top axis** so they don't crowd the profile and numeric ticks coexist. * **No raster/grid** background. Run ``python plot_profile.py -h`` for CLI details. """ from __future__ import annotations import argparse import pickle from pathlib import Path from typing import List import matplotlib.pyplot as plt from matplotlib import ticker import pandas as pd from scipy.signal import savgol_filter # Constants DEFAULT_INFILE = Path("lapplandsleden_profile.pkl") DEFAULT_IMG = Path("profile.svg") HALF_CHAR = "½" # typographic half-symbol SMOOTH_WIN = 81 # global Savitzky-Golay window (odd) SMOOTH_ORDER = 2 # polynomial order for S-G filter # Formatting helpers def format_km(km: float, decimals: int = 1) -> str: return f"{km:.{decimals}f}".rstrip("0").rstrip(".") def format_hours(hours: float) -> str: hrs = round(hours * 2) / 2 # nearest 0.5 h whole = int(hrs) return f"{whole} h" if hrs % 1 == 0 else f"{whole}{HALF_CHAR} h" def duration_naismith(dist_km: float, gain_m: float) -> float: """Naismith rule: 4 km h⁻¹ on the flat + 600 m ascent h⁻¹.""" return dist_km / 4.0 + gain_m / 600.0 # Stage description generator def build_descriptions(df: pd.DataFrame, breaks: List[float], labels: List[str]) -> str: """Return Wikivoyage-style === PlaceA - PlaceB === blocks.""" start_dists = breaks[:-1] end_dists = breaks[1:] start_names = labels[:-1] end_names = labels[1:] thr = 1 # ignore ±1 m noise in cumulative gain/loss blocks: list[str] = [] for s_name, e_name, d0, d1 in zip(start_names, end_names, start_dists, end_dists): seg = df.loc[(df["dist_km"] >= d0) & ( df["dist_km"] <= d1)].reset_index(drop=True) dist_km = d1 - d0 diffs = seg["elev_smoothed"].diff().fillna(0) gain_m = int(round(diffs[diffs > thr].sum())) loss_m = int(round(-diffs[diffs < -thr].sum())) time_h = duration_naismith(dist_km, gain_m) blocks.append( f"=== {s_name} - {e_name} ===\n" f"; Distance : {format_km(dist_km)} km\n" f"; Duration : {format_hours(time_h)} hr\n" f"; Cumulative elevation gain : ↥ {gain_m} m\n" f"; Elevation loss : ↧ {loss_m} m\n" ) return "\n".join(blocks) # Matplotlib chart builder def make_matplotlib_chart( df: pd.DataFrame, breaks: List[float], labels: List[str], *, outfile: Path = DEFAULT_IMG, width_in: float = 8, height_in: float = 3, dpi: int = 150, km_tick: int | None = 25, # distance between numeric ticks ) -> Path: """Draw profile and save to *outfile*. - Bottom axis: numeric kilometre ticks. - Top (secondary) axis: stage labels (vertical text). - Single visible frame (left & bottom spines) at origin; no grid. """ fig, ax = plt.subplots(figsize=(width_in, height_in), dpi=dpi) # Profile line ax.plot(df["dist_km"], df["elev_smoothed"], linewidth=1.2) # Axes spines at origin ax.spines["left"].set_position("zero") ax.spines["bottom"].set_position("zero") ax.spines["right"].set_visible(False) ax.spines["top"].set_visible(False) ax.xaxis.set_ticks_position("bottom") ax.yaxis.set_ticks_position("left") # Numeric kilometre ticks if km_tick: ax.xaxis.set_major_locator(ticker.MultipleLocator(km_tick)) # Stage dividers for x in breaks: ax.axvline(x=x, color="grey", linestyle="--", linewidth=0.8) # Secondary x-axis for labels secax = ax.secondary_xaxis("top") secax.set_xticks(breaks) secax.set_xticklabels(labels, rotation=90, fontsize=8, ha="center", va="bottom") # Hide secondary spine to keep single-frame aesthetic secax.spines["top"].set_visible(False) secax.spines["right"].set_visible(False) secax.spines["left"].set_visible(False) # Axis limits & labels y_min, y_max = df["elev_smoothed"].min(), df["elev_smoothed"].max() + 10 ax.set_ylim(min(0, y_min), y_max) ax.set_xlim(0, max(df["dist_km"] + 10)) ax.set_xlabel("Distance (km)") ax.set_ylabel("Elevation (m)") # Layout tweaks fig.tight_layout() fig.subplots_adjust(top=0.60) # space for vertical stage labels fig.savefig(outfile, format="svg") plt.close(fig) return outfile def main() -> None: cli = argparse.ArgumentParser( description="Emit Wikivoyage stage descriptions and a PNG profile.") cli.add_argument("-i", "--infile", type=Path, default=DEFAULT_INFILE, help="Pickle produced by fetch_profile.py") cli.add_argument("-o", "--output", metavar="FILE.wiki", help="Write description blocks here (default: <infile>.wiki)") cli.add_argument("--img", metavar="FILE", type=Path, default=DEFAULT_IMG, help="Path for the rendered profile") cli.add_argument("--smooth-window", type=int, default=SMOOTH_WIN, help="Savitzky-Golay window length (odd)") cli.add_argument("--smooth-order", type=int, default=SMOOTH_ORDER, help="Savitzky-Golay polynomial order") cli.add_argument("--km-tick", type=int, default=25, help="Distance between numeric kilometre ticks (default 25)") args = cli.parse_args() # Validate pickle if not args.infile.exists(): raise SystemExit( f"Cache {args.infile} not found - run fetch_profile.py first.") with open(args.infile, "rb") as f: bundle = pickle.load(f) df: pd.DataFrame = bundle["df"] breaks: List[float] = bundle["breaks"] labels: List[str] = bundle["labels"] # Smooth elevation df = df.copy() df["elev_smoothed"] = savgol_filter( df["elev_m"].to_numpy(float), window_length=max(3, args.smooth_window | 1), # force odd polyorder=args.smooth_order, mode="nearest", ) # Outputs img_path = make_matplotlib_chart( df, breaks, labels, outfile=args.img, km_tick=args.km_tick) desc_path = Path( args.output) if args.output else args.infile.with_suffix(".wiki") desc_path.write_text(build_descriptions( df, breaks, labels) + "\n", encoding="utf-8") print(f"[saved] {desc_path}\n[saved] {img_path}") if __name__ == "__main__": main()
4720d877-0865-4bdc-99b8-1443ac9f93a6 (talk) 10:55, 9 May 2025 (UTC)
Listings
[edit]I’ve been thinking about listings for a while. While they’re a staple of traditional printed hiking guides, using them on Wikivoyage sometimes feels a bit off. One issue is duplication. When trails share sections, like the Kungsleden and Nordkalottleden, the same listings end up appearing multiple times.
Another challenge is how listings are presented. Personally, I find that placing full listings at the end of paragraphs can create information overload. For example, if you look at the history of the Offa’s Dyke Path article, it looked incredibly cluttered to me at one point.
I noticed it’s also possible to reference listings like this: [[Stockholm archipelago#Grand Hotel Saltsjöbaden|Hotel Saltsjöbaden]], which seems like a cleaner approach. But that raises another question.
I’m unsure where certain types of listings actually belong. Refuges and mountain huts often feel more connected to the trail than to any specific town or region. Still, I hesitate to place them directly within the itinerary sections either. Bluecoordinationfine (talk) 09:47, 8 July 2025 (UTC)
- I don't think the listings cause "information overload" in Nordkalottleden or Södra Kungsleden. On the contrary, I think it is nice to see also that information when reading about a specific leg. Of course, it is possible to cause clutter, and when there is a lot to tell about a place, it is usually best to just have a summary there and link the rest. In those cases, the place is often relevant also outside the specific itinerary and can be described in its own article, such as Ritsem or Kvikkjokk. I see that you moved out listings from the French Way; I haven't looked closer. Whether moving away the listings make sense depends on the specific itinerary and the style of the article.
- Avoiding having the same information duplicated across articles is a real concern. I'd recommend creating articles for hub locations, unless there is some existing article where the information fits. There was a discussion on overlapping sections somewhere, and I think I stated that covering such a section on a separate page would make sense in many cases. In the cases I have stumbled upon, there either has been some itinerary on (more or less) the specific section, there has been an article on just one of the overlapping trails, or the overlapping has been minor. There certainly are cases, where you'd need to create a page on the overlapping section with no rationale other than avoiding duplication. This mostly works if the trails that overlap are similar. If one is a hiking trail and the other a snowmobile route (sharing sights and huts), then referring readers to the other one doesn't feel right.
- –LPfi (talk) 13:10, 10 July 2025 (UTC)
- Hey! Thanks a lot for the reply!
- This is just a personal observation. It’s not that listings are objectively cluttered, but I do get a kind of information overload from listings with a large amount of details like phone numbers, email addresses, or capacity information. I think that's more of a general point on listings presentation though.
- I mostly moved them from the GR 10 article! I think Scandinavian wilderness trails are quite different from trails in Central Europe. For the GR 10, you start and end every day in a village, adding the listings there makes a lot more sense. Not so much for the trails in Scandinavia that you mentioned.
- This is also the reason I hesitate a bit there. It seems the listings belong more to the trail than to a town.
- I don't have a real answer to the question on duplication here, but it's maybe good to mention that this is something to think about in the future. Bluecoordinationfine (talk) 17:21, 10 July 2025 (UTC)
General improvements
[edit]Here are a few possible improvements I’ve been thinking about. Some are relatively straightforward, while others may be more technical:
- Link dynamic map sections to the corresponding text, so clicking a section on the map scrolls to the relevant paragraph below.
- Move listings into separate articles to reduce clutter and keep the trail description more focused.
- Create a structured element for section data (such as elevation gain/loss, duration, and distance), since this information is already structured. This could help maintain consistency in the future.