fix(tui): preserve URL clickability across all TUI views (#12067)

## Problem

Long URLs containing `/` and `-` characters are split across multiple
terminal lines by `textwrap`'s default hyphenation rules. This breaks
terminal link detection: emulators can no longer identify the URL as
clickable, and copy-paste yields a truncated fragment. The issue affects
every view that renders user or agent text — exec output, history cells,
markdown, the app-link setup screen, and the VT100 scrollback path.

A secondary bug compounds the first: `desired_height()` calculations
count logical lines rather than viewport rows. When a URL overflows its
line and wraps visually, the height budget is too small, causing content
to clip or leave gaps.

The screenshots below show what the terminal recognizes as a link before
the fix (only the first line of the URL) and after it (the complete URL):

| Before | After |
|---|---|
| <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 59 11 PM" src="https://github.com/user-attachments/assets/193a89a0-7e56-49c5-8b76-53499a76e7e3" /> | <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 58 40 PM" src="https://github.com/user-attachments/assets/0b9b4c14-aafb-439f-9ffe-f6bba556f95e" /> |

## Mental model

The TUI now treats URL-like tokens as atomic units that must never be
split by the wrapping engine. Every call site that previously used
`word_wrap_*` has been migrated to `adaptive_wrap_*`, which inspects
each line for URL-like tokens and switches wrapping strategy
accordingly:

- **Non-URL lines** follow the existing `textwrap` path unchanged (word
boundaries, optional indentation, hyphenation).
- **URL-only lines** (with at most decorative markers like `│`, `-`,
`1.`) are emitted unwrapped so terminal link detection works; ratatui's
`Wrap { trim: false }` handles the final character wrap at render time.
- **Mixed lines** (URL + substantive non-URL prose) flow through
`adaptive_wrap_line` so prose wraps naturally at word boundaries while
URL tokens remain unsplit.
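
The three paths above can be pictured as a small classifier. This is only an illustrative sketch: `WrapStrategy`, `choose_strategy`, `looks_like_url`, and `is_decorative` are hypothetical names, and the URL check is a deliberately crude stand-in for the real detector in `wrapping.rs`.

```rust
/// Which wrapping path a line takes (hypothetical enum, for illustration only).
#[derive(Debug, PartialEq)]
enum WrapStrategy {
    /// No URL-like token: ordinary word wrapping.
    WordWrap,
    /// URL plus at most decorative markers: emit the line unwrapped.
    Unwrapped,
    /// URL plus substantive prose: wrap the prose, keep URL tokens whole.
    Adaptive,
}

/// Crude stand-in for the real detector: only recognizes `scheme://` tokens.
fn looks_like_url(token: &str) -> bool {
    token.contains("://")
}

/// Decorative markers: tokens with no letters or digits (`│`, `-`, `•`)
/// or ordered-list markers such as `1.`.
fn is_decorative(token: &str) -> bool {
    let no_alnum = !token.chars().any(|c| c.is_alphanumeric());
    let list_marker = token
        .strip_suffix('.')
        .is_some_and(|n| !n.is_empty() && n.chars().all(|c| c.is_ascii_digit()));
    no_alnum || list_marker
}

/// Pick a strategy by scanning whitespace-separated tokens.
fn choose_strategy(line: &str) -> WrapStrategy {
    let mut has_url = false;
    let mut has_prose = false;
    for token in line.split_whitespace() {
        if looks_like_url(token) {
            has_url = true;
        } else if !is_decorative(token) {
            has_prose = true;
        }
    }
    match (has_url, has_prose) {
        (false, _) => WrapStrategy::WordWrap,
        (true, false) => WrapStrategy::Unwrapped,
        (true, true) => WrapStrategy::Adaptive,
    }
}
```

Under these assumptions, `│ https://example.test/a/b-c` is classified as `Unwrapped`, while `see https://example.test/a for details` is classified as `Adaptive`.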

Height measurement everywhere now delegates to
`Paragraph::line_count(width)`, which accounts for the visual row cost
of overflowed lines. This single source of truth replaces ad-hoc line
counting in individual cells.
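
The difference between logical lines and viewport rows is plain ceiling division. The sketch below is not the `Paragraph::line_count` implementation; it only illustrates the arithmetic, using the hypothetical helpers `rows_for_line` and `rows_for_lines` and assuming one terminal column per `char` (real code would use display width).

```rust
/// Rows a single logical line occupies once the terminal character-wraps it.
/// Assumption: one column per `char`; wide glyphs are ignored in this sketch.
fn rows_for_line(line: &str, viewport_width: usize) -> usize {
    let width = viewport_width.max(1);
    // An empty line still occupies one row.
    line.chars().count().max(1).div_ceil(width)
}

/// Total viewport rows for a block of logical lines.
fn rows_for_lines(lines: &[&str], viewport_width: usize) -> usize {
    lines
        .iter()
        .map(|line| rows_for_line(line, viewport_width))
        .sum()
}
```

At a width of 20 columns, a 52-column URL costs three rows even though it is a single logical line, which is exactly the gap a logical-line count misses.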

For terminal scrollback (the VT100 path that prints history when the TUI
exits), URL-only lines are emitted unwrapped so the terminal's own link
detector can find them. Mixed URL+prose lines use adaptive wrapping so
surrounding text wraps naturally. Continuation rows are pre-cleared to
avoid stale content artifacts.

## Non-goals

- Full RFC 3986 URL parsing. The detector is a conservative heuristic
that covers `scheme://host`, bare domains (`example.com/path`),
`localhost:port`, and IPv4 hosts. IPv6 (`[::1]:8080`) and exotic schemes
are intentionally excluded from v1.
- Changing wrapping behavior for non-URL content.
- Reflowing or reformatting existing terminal scrollback on resize.

## Tradeoffs

| Decision | Upside | Downside |
|----------|--------|----------|
| Heuristic URL detection vs. full parser | Fast, zero-alloc on the hot path; conservative enough to reject file paths like `src/main.rs` | False negatives on obscure URL formats (they get split as before) |
| Adaptive (three-path) wrapping | Non-URL lines are untouched: no behavior change, no perf cost; mixed lines wrap prose naturally while preserving URLs | Three wrapping strategies to reason about when debugging layout |
| Row-based truncation with line-unit ellipsis | Accurate viewport budget; stable "N lines omitted" count across terminal widths | `truncate_lines_middle` is more complex (must compute per-line row cost) |
| Unwrapped URL-only lines in scrollback | Terminal emulators detect clickable links; copy-paste gets the full URL | TUI and scrollback formatting diverge for URL-only lines |
| Default `desired_height` via `Paragraph::line_count` | DRY: most cells inherit correct measurement | Cells with custom layout must remember to override |

## Architecture

```mermaid
flowchart TD
    A["adaptive_wrap_*()"] --> B{"line_contains_url_like?"}
    B -- No URL tokens --> C["word_wrap_line<br/>(textwrap default)"]
    B -- Has URL tokens --> D{"mixed URL + prose?"}
    D -- "URL-only<br/>(+ decorative markers)" --> E["emit unwrapped<br/>(terminal char-wraps)"]
    D -- "Mixed<br/>(URL + substantive text)" --> F["adaptive_wrap_line<br/>(AsciiSpace + custom WordSplitter)"]
    C --> G["Paragraph::line_count(w)<br/>(single height truth)"]
    E --> G
    F --> G
```

**Changed files:**

| File | Role |
|------|------|
| `wrapping.rs` | URL detection heuristics, mixed-line detection, `adaptive_wrap_*` functions, custom `WordSplitter` |
| `exec_cell/render.rs` | Row-aware `truncate_lines_middle`, adaptive wrapping for command/output display |
| `history_cell.rs` | Migrate all cell types to `adaptive_wrap_*`; default `desired_height` via `Paragraph::line_count` |
| `insert_history.rs` | Three-path scrollback wrapping (unwrapped URL-only, adaptive mixed, word-wrapped text); continuation row clearing |
| `app_link_view.rs` | Adaptive wrapping for setup URL; `desired_height` via `Paragraph::line_count` |
| `markdown_render.rs` | Adaptive wrapping in `finish_paragraph` |
| `model_migration.rs` | Viewport-aware wrapping for narrow-pane markdown |
| `pager_overlay.rs` | `Wrap { trim: false }` for transcript and streaming chunks |
| `queued_user_messages.rs` | Migrate to `adaptive_wrap_lines` |
| `status/card.rs` | Migrate to `adaptive_wrap_lines` |

## Observability

- **Ellipsis message** in truncated exec output reports omitted count in
logical lines (stable across resize) rather than viewport rows
(fluctuates).
- URL detection is deterministic and stateless — no hidden caching or
memoization to go stale.
- Height mismatch bugs surface immediately as visual clipping or gaps;
the `Paragraph::line_count` path is the same code ratatui uses at render
time, so measurement and rendering cannot diverge.
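
The split between a row-based budget and a line-based ellipsis count can be sketched as follows. This is not the `truncate_lines_middle` from `exec_cell/render.rs`; `truncate_middle` and `row_cost` are hypothetical names, the ellipsis text is invented, and the sketch assumes one column per `char`, `max_rows >= 1`, and an ellipsis that fits on one row.

```rust
/// Rows a logical line occupies at `width` columns (one column per `char`).
fn row_cost(line: &str, width: usize) -> usize {
    line.chars().count().max(1).div_ceil(width.max(1))
}

/// Keep head and tail lines that fit in `max_rows` viewport rows and replace
/// the middle with an ellipsis that reports omitted *logical lines*.
fn truncate_middle(lines: &[String], width: usize, max_rows: usize) -> Vec<String> {
    let total: usize = lines.iter().map(|l| row_cost(l, width)).sum();
    if total <= max_rows {
        return lines.to_vec();
    }
    // Reserve one row for the ellipsis, then split the rest between head and tail.
    let budget = max_rows.saturating_sub(1);
    let head_budget = budget / 2;
    let tail_budget = budget - head_budget;

    // Take whole lines from the top while they fit in the head budget.
    let mut head_end = 0;
    let mut used = 0;
    while head_end < lines.len() {
        let cost = row_cost(&lines[head_end], width);
        if used + cost > head_budget {
            break;
        }
        used += cost;
        head_end += 1;
    }

    // Take whole lines from the bottom while they fit in the tail budget.
    let mut tail_start = lines.len();
    let mut used = 0;
    while tail_start > head_end {
        let cost = row_cost(&lines[tail_start - 1], width);
        if used + cost > tail_budget {
            break;
        }
        used += cost;
        tail_start -= 1;
    }

    let omitted = tail_start - head_end;
    let mut out = lines[..head_end].to_vec();
    out.push(format!("… +{omitted} lines"));
    out.extend_from_slice(&lines[tail_start..]);
    out
}
```

The budget is spent in rows, so a long overflowing line is charged for every row it occupies, but the message counts lines, so the reader sees a number that refers to the output itself rather than to the current terminal geometry.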

## Tests

26 new unit tests across 7 files, covering:

- **URL integrity**: assert a URL-like token appears on exactly one
rendered line (not split across two).
- **Height accuracy**: compare `desired_height()` against
`Paragraph::line_count()` for URL-containing content.
- **Row-aware truncation**: verify ellipsis counts logical lines and
output fits within the row budget.
- **Scrollback rendering**: VT100 backend tests confirm prefix and URL
land on the same row; continuation rows are cleared; mixed URL+prose
lines wrap prose while preserving URL tokens.
- **Mixed URL+prose detection**: `line_has_mixed_url_and_non_url_tokens`
correctly distinguishes lines with substantive non-URL text from lines
with only decorative markers alongside a URL.
- **Heuristic correctness**: positive matches (`https://...`,
`example.com/path`, `localhost:3000/api`, `192.168.1.1:8080/health`) and
negative matches (`src/main.rs`, `foo/bar`, `hello-world`).
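
To make the positive and negative cases concrete, here is one way such a heuristic could be written. It is a hypothetical sketch, not the detector from `wrapping.rs`: `is_url_like` and `is_domain` are invented names, and the rules (a `scheme://` prefix, or a host that is a dotted domain, an IPv4 address, or `localhost` with a port, followed by a path or port) are assumptions chosen to match the examples listed above.

```rust
use std::net::Ipv4Addr;

/// Dotted domain: at least two non-empty labels of ASCII alphanumerics or `-`,
/// ending in an alphabetic label of two or more letters.
fn is_domain(host: &str) -> bool {
    let labels: Vec<&str> = host.split('.').collect();
    labels.len() >= 2
        && labels.iter().all(|l| {
            !l.is_empty() && l.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        })
        && labels
            .last()
            .is_some_and(|tld| tld.len() >= 2 && tld.chars().all(|c| c.is_ascii_alphabetic()))
}

/// Hypothetical URL-like check for a single whitespace-delimited token.
fn is_url_like(token: &str) -> bool {
    // Case 1: `scheme://rest` with an alphabetic scheme.
    if let Some((scheme, rest)) = token.split_once("://") {
        return !scheme.is_empty()
            && !rest.is_empty()
            && scheme.chars().all(|c| c.is_ascii_alphabetic());
    }
    // Case 2: `host[:port][/path]`.
    let (authority, has_path) = match token.split_once('/') {
        Some((authority, _path)) => (authority, true),
        None => (token, false),
    };
    let (host, port) = match authority.rsplit_once(':') {
        Some((host, port)) => (host, Some(port)),
        None => (authority, None),
    };
    if let Some(p) = port {
        if p.is_empty() || !p.chars().all(|c| c.is_ascii_digit()) {
            return false;
        }
    }
    let has_port = port.is_some();
    // A bare word such as `hello-world` or `main.rs` has neither path nor port.
    if !has_path && !has_port {
        return false;
    }
    (host == "localhost" && has_port) || host.parse::<Ipv4Addr>().is_ok() || is_domain(host)
}
```

With these rules `src/main.rs` is rejected because its first segment `src` has no dot, and `[::1]:8080/path` is rejected because a bracketed host is neither a domain nor an IPv4 address. A sketch this small still has false positives (for example `README.md/x`), which is the kind of trade-off the heuristic approach accepts.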

## Risks and open items

1. **URL-like tokens in code output** (e.g. `example.com/api` inside a
JSON blob) will trigger URL-preserving wrap on that line. This is
acceptable — the worst case is a slightly wider line, not broken output.
2. **Very long non-URL tokens on a URL line** can only break at
character boundaries (the custom splitter emits all char indices for
non-URL words). On extremely narrow terminals this could overflow, but
narrow terminals already degrade gracefully.
3. **No IPv6 support** — `[::1]:8080/path` will be treated as a non-URL
and may get split. Can be added later without API changes.
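
The splitter behavior described in item 2 can be sketched as a function that returns the allowed break offsets for one word. `split_points` is a hypothetical name and the `://` test is a simplified stand-in for the real URL check; the `fn(&str) -> Vec<usize>` shape is meant to resemble what `textwrap`'s custom word splitter accepts, but treat that pairing as an assumption rather than a description of the actual code.

```rust
/// Byte offsets at which `word` may be broken.
/// URL-like words get no split points; every other word may break at any
/// interior character boundary.
fn split_points(word: &str) -> Vec<usize> {
    // Simplified stand-in for the real URL detector.
    if word.contains("://") {
        return Vec::new();
    }
    // Skip offset 0: breaking before the first character is not a split.
    word.char_indices().skip(1).map(|(i, _)| i).collect()
}
```

Offsets are byte positions on character boundaries, so a multi-byte character such as `│` is never cut in half.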

Fixes #5457
Felipe Coury
2026-02-21 20:31:41 -03:00
committed by GitHub
parent 66d5d34e6e
commit 2ba2c57af4
14 changed files with 1839 additions and 177 deletions


@@ -2,9 +2,16 @@ use std::fmt;
 use std::io;
 use std::io::Write;
-use crate::wrapping::word_wrap_lines_borrowed;
+use crate::wrapping::RtOptions;
+use crate::wrapping::adaptive_wrap_line;
+use crate::wrapping::line_contains_url_like;
+use crate::wrapping::line_has_mixed_url_and_non_url_tokens;
 use crossterm::Command;
+use crossterm::cursor::MoveDown;
 use crossterm::cursor::MoveTo;
+use crossterm::cursor::MoveToColumn;
+use crossterm::cursor::RestorePosition;
+use crossterm::cursor::SavePosition;
 use crossterm::queue;
 use crossterm::style::Color as CColor;
 use crossterm::style::Colors;
@@ -38,10 +45,35 @@ where
     let last_cursor_pos = terminal.last_known_cursor_pos;
     let writer = terminal.backend_mut();
-    // Pre-wrap lines using word-aware wrapping so terminal scrollback sees the same
-    // formatting as the TUI. This avoids character-level hard wrapping by the terminal.
-    let wrapped = word_wrap_lines_borrowed(&lines, area.width.max(1) as usize);
-    let wrapped_lines = wrapped.len() as u16;
+    // Pre-wrap lines for terminal scrollback. Three paths:
+    //
+    // - URL-only-ish lines are kept intact (no hard newlines inserted) so that
+    //   terminal emulators can match them as clickable links. The
+    //   terminal will character-wrap these lines at the viewport
+    //   boundary.
+    // - Mixed lines (URL + non-URL prose) are adaptively wrapped so
+    //   non-URL text still wraps naturally while URL tokens remain
+    //   unsplit.
+    // - Non-URL lines also flow through adaptive wrapping; behavior is
+    //   equivalent to standard wrapping when no URL is present.
+    let wrap_width = area.width.max(1) as usize;
+    let mut wrapped = Vec::new();
+    let mut wrapped_rows = 0usize;
+    for line in &lines {
+        let line_wrapped =
+            if line_contains_url_like(line) && !line_has_mixed_url_and_non_url_tokens(line) {
+                vec![line.clone()]
+            } else {
+                adaptive_wrap_line(line, RtOptions::new(wrap_width))
+            };
+        wrapped_rows += line_wrapped
+            .iter()
+            .map(|wrapped_line| wrapped_line.width().max(1).div_ceil(wrap_width))
+            .sum::<usize>();
+        wrapped.extend(line_wrapped);
+    }
+    let wrapped_lines = wrapped_rows as u16;
     let cursor_top = if area.bottom() < screen_size.height {
         // If the viewport is not at the bottom of the screen, scroll it down to make room.
         // Don't scroll it past the bottom of the screen.
@@ -94,6 +126,18 @@ where
     for line in wrapped {
         queue!(writer, Print("\r\n"))?;
+        // URL lines can be wider than the terminal and will
+        // character-wrap onto continuation rows. Pre-clear those rows
+        // so stale content from a previously longer line is erased.
+        let physical_rows = line.width().max(1).div_ceil(wrap_width);
+        if physical_rows > 1 {
+            queue!(writer, SavePosition)?;
+            for _ in 1..physical_rows {
+                queue!(writer, MoveDown(1), MoveToColumn(0))?;
+                queue!(writer, Clear(ClearType::UntilNewLine))?;
+            }
+            queue!(writer, RestorePosition)?;
+        }
         queue!(
             writer,
             SetColors(Colors::new(
@@ -527,4 +571,163 @@ mod tests {
             );
         }
     }
+    #[test]
+    fn vt100_prefixed_url_keeps_prefix_and_url_on_same_row() {
+        let width: u16 = 48;
+        let height: u16 = 8;
+        let backend = VT100Backend::new(width, height);
+        let mut term = crate::custom_terminal::Terminal::with_options(backend).expect("terminal");
+        let viewport = Rect::new(0, height - 1, width, 1);
+        term.set_viewport_area(viewport);
+        let url = "http://a-long-url.com/this/that/blablablab/new.aspx/many_people_like_how";
+        let line: Line<'static> = Line::from(vec!["  │ ".into(), url.into()]);
+        insert_history_lines(&mut term, vec![line]).expect("insert history");
+        let rows: Vec<String> = term.backend().vt100().screen().rows(0, width).collect();
+        assert!(
+            rows.iter().any(|r| r.contains("│ http://a-long-url.com")),
+            "expected prefix and URL on same row, rows: {rows:?}"
+        );
+        assert!(
+            !rows.iter().any(|r| r.trim_end() == "  │"),
+            "unexpected orphan prefix row, rows: {rows:?}"
+        );
+    }
+    #[test]
+    fn vt100_prefixed_url_like_without_scheme_keeps_prefix_and_token_on_same_row() {
+        let width: u16 = 48;
+        let height: u16 = 8;
+        let backend = VT100Backend::new(width, height);
+        let mut term = crate::custom_terminal::Terminal::with_options(backend).expect("terminal");
+        let viewport = Rect::new(0, height - 1, width, 1);
+        term.set_viewport_area(viewport);
+        let url_like =
+            "example.test/api/v1/projects/alpha-team/releases/2026-02-17/builds/1234567890";
+        let line: Line<'static> = Line::from(vec!["  │ ".into(), url_like.into()]);
+        insert_history_lines(&mut term, vec![line]).expect("insert history");
+        let rows: Vec<String> = term.backend().vt100().screen().rows(0, width).collect();
+        assert!(
+            rows.iter()
+                .any(|r| r.contains("│ example.test/api/v1/projects")),
+            "expected prefix and URL-like token on same row, rows: {rows:?}"
+        );
+        assert!(
+            !rows.iter().any(|r| r.trim_end() == "  │"),
+            "unexpected orphan prefix row, rows: {rows:?}"
+        );
+    }
+    #[test]
+    fn vt100_prefixed_mixed_url_line_wraps_suffix_words_together() {
+        let width: u16 = 24;
+        let height: u16 = 10;
+        let backend = VT100Backend::new(width, height);
+        let mut term = crate::custom_terminal::Terminal::with_options(backend).expect("terminal");
+        let viewport = Rect::new(0, height - 1, width, 1);
+        term.set_viewport_area(viewport);
+        let url = "https://example.test/path/abcdef12345";
+        let line: Line<'static> = Line::from(vec![
+            "  │ ".into(),
+            "see ".into(),
+            url.into(),
+            " tail words".into(),
+        ]);
+        insert_history_lines(&mut term, vec![line]).expect("insert mixed history");
+        let rows: Vec<String> = term.backend().vt100().screen().rows(0, width).collect();
+        assert!(
+            rows.iter().any(|r| r.contains("│ see")),
+            "expected prefixed prose before URL, rows: {rows:?}"
+        );
+        assert!(
+            rows.iter().any(|r| r.contains("tail words")),
+            "expected suffix words to wrap as a phrase, rows: {rows:?}"
+        );
+    }
+    #[test]
+    fn vt100_unwrapped_url_like_clears_continuation_rows() {
+        let width: u16 = 20;
+        let height: u16 = 10;
+        let backend = VT100Backend::new(width, height);
+        let mut term = crate::custom_terminal::Terminal::with_options(backend).expect("terminal");
+        let viewport = Rect::new(0, height - 1, width, 1);
+        term.set_viewport_area(viewport);
+        let filler_line: Line<'static> = Line::from(vec![
+            "  │ ".into(),
+            "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX".into(),
+        ]);
+        insert_history_lines(&mut term, vec![filler_line]).expect("insert filler history");
+        let url_like = "example.test/api/v1/short";
+        let url_line: Line<'static> = Line::from(vec!["  │ ".into(), url_like.into()]);
+        insert_history_lines(&mut term, vec![url_line]).expect("insert url-like history");
+        let rows: Vec<String> = term.backend().vt100().screen().rows(0, width).collect();
+        let first_row = rows
+            .iter()
+            .position(|row| row.contains("│ example.test/api"))
+            .unwrap_or_else(|| panic!("expected url-like first row in screen rows: {rows:?}"));
+        assert!(
+            first_row + 1 < rows.len(),
+            "expected a continuation row for wrapped URL-like line, rows: {rows:?}"
+        );
+        let continuation_row = rows[first_row + 1].trim_end();
+        assert!(
+            continuation_row.contains("/v1/short") || continuation_row.contains("short"),
+            "expected continuation row to contain wrapped URL-like tail, got: {continuation_row:?}"
+        );
+        assert!(
+            !continuation_row.contains('X'),
+            "expected continuation row to be cleared before writing wrapped URL-like content, got: {continuation_row:?}"
+        );
+    }
+    #[test]
+    fn vt100_long_unwrapped_url_does_not_insert_extra_blank_gap_before_content() {
+        let width: u16 = 56;
+        let height: u16 = 24;
+        let backend = VT100Backend::new(width, height);
+        let mut term = crate::custom_terminal::Terminal::with_options(backend).expect("terminal");
+        let viewport = Rect::new(0, height - 1, width, 1);
+        term.set_viewport_area(viewport);
+        let prompt = "Write a long URL as output for testing";
+        insert_history_lines(&mut term, vec![Line::from(prompt)]).expect("insert prompt line");
+        let long_url = format!(
+            "https://example.test/api/v1/projects/alpha-team/releases/2026-02-17/builds/1234567890/{}",
+            "very-long-segment-".repeat(16),
+        );
+        let url_line: Line<'static> = Line::from(vec!["• ".into(), long_url.into()]);
+        insert_history_lines(&mut term, vec![url_line]).expect("insert long url line");
+        let rows: Vec<String> = term.backend().vt100().screen().rows(0, width).collect();
+        let prompt_row = rows
+            .iter()
+            .position(|row| row.contains("Write a long URL as output for testing"))
+            .unwrap_or_else(|| panic!("expected prompt row in screen rows: {rows:?}"));
+        let url_row = rows
+            .iter()
+            .position(|row| row.contains("• https://example.test/api"))
+            .unwrap_or_else(|| panic!("expected URL first row in screen rows: {rows:?}"));
+        assert!(
+            url_row <= prompt_row + 2,
+            "expected URL content to appear immediately after prompt (allowing at most one spacer row), got prompt_row={prompt_row}, url_row={url_row}, rows={rows:?}",
+        );
+    }
 }