mirror of
https://github.com/openai/codex.git
synced 2026-05-04 05:11:37 +03:00
fix(tui): preserve URL clickability across all TUI views (#12067)
## Problem

Long URLs containing `/` and `-` characters are split across multiple terminal lines by `textwrap`'s default hyphenation rules. This breaks terminal link detection: emulators can no longer identify the URL as clickable, and copy-paste yields a truncated fragment. The issue affects every view that renders user or agent text: exec output, history cells, markdown, the app-link setup screen, and the VT100 scrollback path.

A secondary bug compounds the first: `desired_height()` calculations count logical lines rather than viewport rows. When a URL overflows its line and wraps visually, the height budget is too small, causing content to clip or leave gaps.

Here is how the complete URL is interpreted by the terminal before (first line only) and after (complete URL):

| Before | After |
|---|---|
| <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 59 11 PM" src="https://github.com/user-attachments/assets/193a89a0-7e56-49c5-8b76-53499a76e7e3" /> | <img width="777" height="1002" alt="Screenshot 2026-02-17 at 7 58 40 PM" src="https://github.com/user-attachments/assets/0b9b4c14-aafb-439f-9ffe-f6bba556f95e" /> |

## Mental model

The TUI now treats URL-like tokens as atomic units that must never be split by the wrapping engine. Every call site that previously used `word_wrap_*` has been migrated to `adaptive_wrap_*`, which inspects each line for URL-like tokens and switches wrapping strategy accordingly:

- **Non-URL lines** follow the existing `textwrap` path unchanged (word boundaries, optional indentation, hyphenation).
- **URL-only lines** (with at most decorative markers like `│`, `-`, `1.`) are emitted unwrapped so terminal link detection works; ratatui's `Wrap { trim: false }` handles the final character wrap at render time.
- **Mixed lines** (URL + substantive non-URL prose) flow through `adaptive_wrap_line` so prose wraps naturally at word boundaries while URL tokens remain unsplit.

Height measurement everywhere now delegates to `Paragraph::line_count(width)`, which accounts for the visual row cost of overflowed lines. This single source of truth replaces ad-hoc line counting in individual cells.

For terminal scrollback (the VT100 path that prints history when the TUI exits), URL-only lines are emitted unwrapped so the terminal's own link detector can find them. Mixed URL+prose lines use adaptive wrapping so surrounding text wraps naturally. Continuation rows are pre-cleared to avoid stale content artifacts.

## Non-goals

- Full RFC 3986 URL parsing. The detector is a conservative heuristic that covers `scheme://host`, bare domains (`example.com/path`), `localhost:port`, and IPv4 hosts. IPv6 (`[::1]:8080`) and exotic schemes are intentionally excluded from v1.
- Changing wrapping behavior for non-URL content.
- Reflowing or reformatting existing terminal scrollback on resize.

## Tradeoffs

| Decision | Upside | Downside |
|----------|--------|----------|
| Heuristic URL detection vs. full parser | Fast, zero-alloc on the hot path; conservative enough to reject file paths like `src/main.rs` | False negatives on obscure URL formats (they get split as before) |
| Adaptive (three-path) wrapping | Non-URL lines are untouched (no behavior change, no perf cost); mixed lines wrap prose naturally while preserving URLs | Three wrapping strategies to reason about when debugging layout |
| Row-based truncation with line-unit ellipsis | Accurate viewport budget; stable "N lines omitted" count across terminal widths | `truncate_lines_middle` is more complex (must compute per-line row cost) |
| Unwrapped URL-only lines in scrollback | Terminal emulators detect clickable links; copy-paste gets the full URL | TUI and scrollback formatting diverge for URL-only lines |
| Default `desired_height` via `Paragraph::line_count` | DRY: most cells inherit correct measurement | Cells with custom layout must remember to override |

## Architecture

```mermaid
flowchart TD
    A["adaptive_wrap_*()"] --> B{"line_contains_url_like?"}
    B -- No URL tokens --> C["word_wrap_line<br/>(textwrap default)"]
    B -- Has URL tokens --> D{"mixed URL + prose?"}
    D -- "URL-only<br/>(+ decorative markers)" --> E["emit unwrapped<br/>(terminal char-wraps)"]
    D -- "Mixed<br/>(URL + substantive text)" --> F["adaptive_wrap_line<br/>(AsciiSpace + custom WordSplitter)"]
    C --> G["Paragraph::line_count(w)<br/>(single height truth)"]
    E --> G
    F --> G
```

**Changed files:**

| File | Role |
|------|------|
| `wrapping.rs` | URL detection heuristics, mixed-line detection, `adaptive_wrap_*` functions, custom `WordSplitter` |
| `exec_cell/render.rs` | Row-aware `truncate_lines_middle`, adaptive wrapping for command/output display |
| `history_cell.rs` | Migrate all cell types to `adaptive_wrap_*`; default `desired_height` via `Paragraph::line_count` |
| `insert_history.rs` | Three-path scrollback wrapping (unwrapped URL-only, adaptive mixed, word-wrapped text); continuation row clearing |
| `app_link_view.rs` | Adaptive wrapping for setup URL; `desired_height` via `Paragraph::line_count` |
| `markdown_render.rs` | Adaptive wrapping in `finish_paragraph` |
| `model_migration.rs` | Viewport-aware wrapping for narrow-pane markdown |
| `pager_overlay.rs` | `Wrap { trim: false }` for transcript and streaming chunks |
| `queued_user_messages.rs` | Migrate to `adaptive_wrap_lines` |
| `status/card.rs` | Migrate to `adaptive_wrap_lines` |

## Observability

- **Ellipsis message** in truncated exec output reports omitted count in logical lines (stable across resize) rather than viewport rows (fluctuates).
- URL detection is deterministic and stateless, with no hidden caching or memoization to go stale.
- Height mismatch bugs surface immediately as visual clipping or gaps; the `Paragraph::line_count` path is the same code ratatui uses at render time, so measurement and rendering cannot diverge.

## Tests

26 new unit tests across 7 files, covering:

- **URL integrity**: assert a URL-like token appears on exactly one rendered line (not split across two).
- **Height accuracy**: compare `desired_height()` against `Paragraph::line_count()` for URL-containing content.
- **Row-aware truncation**: verify ellipsis counts logical lines and output fits within the row budget.
- **Scrollback rendering**: VT100 backend tests confirm prefix and URL land on the same row; continuation rows are cleared; mixed URL+prose lines wrap prose while preserving URL tokens.
- **Mixed URL+prose detection**: `line_has_mixed_url_and_non_url_tokens` correctly distinguishes lines with substantive non-URL text from lines with only decorative markers alongside a URL.
- **Heuristic correctness**: positive matches (`https://...`, `example.com/path`, `localhost:3000/api`, `192.168.1.1:8080/health`) and negative matches (`src/main.rs`, `foo/bar`, `hello-world`).

## Risks and open items

1. **URL-like tokens in code output** (e.g. `example.com/api` inside a JSON blob) will trigger URL-preserving wrap on that line. This is acceptable: the worst case is a slightly wider line, not broken output.
2. **Very long non-URL tokens on a URL line** can only break at character boundaries (the custom splitter emits all char indices for non-URL words). On extremely narrow terminals this could overflow, but narrow terminals already degrade gracefully.
3. **No IPv6 support**: `[::1]:8080/path` will be treated as a non-URL and may get split. Can be added later without API changes.

Fixes #5457
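As an illustration of the three-path mental model described above, here is a minimal, std-only sketch of how a line might be classified. It is not the code from this commit: `WrapPath`, `classify`, `looks_like_url`, and `is_substantive` are hypothetical names, and the URL check is deliberately reduced to `scheme://rest` tokens, whereas the commit's heuristic also accepts bare domains, `localhost:port`, and IPv4 hosts.

```rust
/// Which wrapping strategy a line would take (illustrative names only).
#[derive(Debug, PartialEq, Eq)]
enum WrapPath {
    /// No URL-like token: ordinary word wrapping.
    Plain,
    /// URL plus, at most, decorative markers: emit unwrapped.
    UrlOnly,
    /// URL plus substantive prose: wrap prose, keep URL tokens whole.
    Mixed,
}

/// Strips surrounding punctuation, using the same character set as
/// `trim_url_token` in the diff.
fn trim_token(token: &str) -> &str {
    token.trim_matches(|c: char| {
        matches!(
            c,
            '(' | ')' | '[' | ']' | '{' | '}' | '<' | '>' | ',' | '.' | ';' | ':' | '!' | '\'' | '"'
        )
    })
}

/// Assumption: a crude stand-in for the real heuristic that only
/// recognizes `scheme://rest` tokens.
fn looks_like_url(raw: &str) -> bool {
    match trim_token(raw).split_once("://") {
        Some((scheme, rest)) => {
            !rest.is_empty()
                && scheme.chars().next().is_some_and(|c| c.is_ascii_alphabetic())
                && scheme
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.'))
        }
        None => false,
    }
}

/// A token counts as prose if it contains an alphanumeric character and
/// is not a numeric list marker such as `1.` or `2)`.
fn is_substantive(raw: &str) -> bool {
    let token = trim_token(raw);
    let is_list_marker = !token.is_empty()
        && token.chars().all(|c| c.is_ascii_digit())
        && (raw.ends_with('.') || raw.ends_with(')'));
    !is_list_marker && token.chars().any(char::is_alphanumeric)
}

/// Classifies a line by scanning its whitespace-delimited tokens.
fn classify(line: &str) -> WrapPath {
    let mut saw_url = false;
    let mut saw_prose = false;
    for token in line.split_ascii_whitespace() {
        if looks_like_url(token) {
            saw_url = true;
        } else if is_substantive(token) {
            saw_prose = true;
        }
    }
    match (saw_url, saw_prose) {
        (false, _) => WrapPath::Plain,
        (true, false) => WrapPath::UrlOnly,
        (true, true) => WrapPath::Mixed,
    }
}
```

Under these assumptions, `classify("see https://example.com/path for details")` yields `WrapPath::Mixed`, while `classify("1. https://example.com/path")` yields `WrapPath::UrlOnly` because the list marker is not counted as prose.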
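The height bug from the Problem section comes down to counting logical lines where viewport rows are needed. The sketch below shows the difference under simplifying assumptions: hard character wrapping, one terminal column per `char`, and hypothetical helper names (`rows_for_line`, `logical_height`, `viewport_height`). It does not call ratatui; per the commit message, the real measurement is delegated to `Paragraph::line_count(width)`.

```rust
/// Rows a single logical line occupies when hard-wrapped at `width` columns.
/// Assumption: every `char` is one column wide; an empty line still takes a row.
fn rows_for_line(line: &str, width: usize) -> usize {
    assert!(width > 0, "width must be non-zero");
    let cols = line.chars().count();
    // Ceiling division, with a floor of one row for empty lines.
    ((cols + width - 1) / width).max(1)
}

/// The buggy measurement: one row per logical line, regardless of overflow.
fn logical_height(lines: &[&str]) -> usize {
    lines.len()
}

/// The row-aware measurement: sum the visual row cost of every line.
fn viewport_height(lines: &[&str], width: usize) -> usize {
    lines.iter().map(|line| rows_for_line(line, width)).sum()
}
```

With a 20-column viewport, a short line followed by a 45-character unwrapped URL is two logical lines but needs four rows (one plus three), which is the gap that caused clipping when the logical count was used as the height budget.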
@@ -1,3 +1,31 @@
+//! Word-wrapping with URL-aware heuristics.
+//!
+//! The TUI renders text that frequently contains URLs — command output,
+//! markdown, agent messages, tool-call results. Standard `textwrap`
+//! hyphenation treats `/` and `-` as split points, which breaks URLs
+//! across lines and makes them unclickable in terminal emulators.
+//!
+//! This module provides two wrapping paths:
+//!
+//! - **Standard** (`word_wrap_line`, `word_wrap_lines`): delegates to
+//!   `textwrap` with the caller's options unchanged. Used when the
+//!   content is known to be plain prose.
+//! - **Adaptive** (`adaptive_wrap_line`, `adaptive_wrap_lines`):
+//!   inspects the line for URL-like tokens; if any are found, the
+//!   wrapping switches to `AsciiSpace` word separation and a custom
+//!   `WordSplitter` that refuses to split URL tokens. Non-URL tokens
+//!   on the same line still break at every character boundary (the
+//!   custom splitter returns all char indices for non-URL words).
+//!
+//! Callers that *might* encounter URLs should use the `adaptive_*`
+//! functions. Callers that definitely will not (code blocks, pure
+//! numeric output) can use the standard path for speed.
+//!
+//! URL detection is heuristic — see [`text_contains_url_like`] for the
+//! rules. False positives suppress hyphenation for that line; false
+//! negatives let a URL get split. The heuristic is intentionally
+//! conservative: file paths like `src/main.rs` are not matched.
+
 use ratatui::text::Line;
 use ratatui::text::Span;
 use std::borrow::Cow;
@@ -6,12 +34,16 @@ use textwrap::Options;

 use crate::render::line_utils::push_owned_lines;

+/// Returns byte-ranges into `text` for each wrapped line, including
+/// trailing whitespace and a +1 sentinel byte. Used by the textarea
+/// cursor-position logic.
 pub(crate) fn wrap_ranges<'a, O>(text: &str, width_or_options: O) -> Vec<Range<usize>>
 where
     O: Into<Options<'a>>,
 {
     let opts = width_or_options.into();
     let mut lines: Vec<Range<usize>> = Vec::new();
+    let mut cursor = 0usize;
     for line in textwrap::wrap(text, opts).iter() {
         match line {
             std::borrow::Cow::Borrowed(slice) => {
@@ -19,8 +51,14 @@ where
                 let end = start + slice.len();
                 let trailing_spaces = text[end..].chars().take_while(|c| *c == ' ').count();
                 lines.push(start..end + trailing_spaces + 1);
+                cursor = end + trailing_spaces;
+            }
+            std::borrow::Cow::Owned(slice) => {
+                let mapped = map_owned_wrapped_line_to_range(text, cursor, slice);
+                let trailing_spaces = text[mapped.end..].chars().take_while(|c| *c == ' ').count();
+                lines.push(mapped.start..mapped.end + trailing_spaces + 1);
+                cursor = mapped.end + trailing_spaces;
             }
-            std::borrow::Cow::Owned(_) => panic!("wrap_ranges: unexpected owned string"),
         }
     }
     lines
@@ -35,19 +73,429 @@ where
 {
     let opts = width_or_options.into();
     let mut lines: Vec<Range<usize>> = Vec::new();
+    let mut cursor = 0usize;
     for line in textwrap::wrap(text, opts).iter() {
         match line {
             std::borrow::Cow::Borrowed(slice) => {
                 let start = unsafe { slice.as_ptr().offset_from(text.as_ptr()) as usize };
                 let end = start + slice.len();
                 lines.push(start..end);
+                cursor = end;
+            }
+            std::borrow::Cow::Owned(slice) => {
+                let mapped = map_owned_wrapped_line_to_range(text, cursor, slice);
+                lines.push(mapped.clone());
+                cursor = mapped.end;
             }
-            std::borrow::Cow::Owned(_) => panic!("wrap_ranges_trim: unexpected owned string"),
         }
     }
     lines
 }

+/// Maps an owned (materialized) wrapped line back to a byte range in `text`.
+///
+/// `textwrap` returns `Cow::Owned` when it inserts a hyphenation penalty
+/// character (typically `-`) that does not exist in the source. This
+/// function walks the owned string character-by-character against the
+/// source, skipping trailing penalty chars, and returns the
+/// corresponding source byte range starting from `cursor`.
+fn map_owned_wrapped_line_to_range(text: &str, cursor: usize, wrapped: &str) -> Range<usize> {
+    let mut start = cursor;
+    while start < text.len() && !wrapped.starts_with(' ') {
+        let Some(ch) = text[start..].chars().next() else {
+            break;
+        };
+        if ch != ' ' {
+            break;
+        }
+        start += ch.len_utf8();
+    }
+
+    let mut end = start;
+    let mut chars = wrapped.chars().peekable();
+    while let Some(ch) = chars.next() {
+        if end < text.len() {
+            let Some(src) = text[end..].chars().next() else {
+                unreachable!("checked end < text.len()");
+            };
+            if ch == src {
+                end += src.len_utf8();
+                continue;
+            }
+        }
+
+        // textwrap can materialize owned lines when penalties are inserted.
+        // The default penalty is a trailing '-'; it does not correspond to
+        // source bytes, so we skip it while keeping byte ranges in source text.
+        if ch == '-' && chars.peek().is_none() {
+            continue;
+        }
+
+        panic!("wrap_ranges: could not map owned line {wrapped:?} to source near byte {cursor}");
+    }
+
+    start..end
+}
+
+/// Returns `true` if any whitespace-delimited token in `line` looks like a URL.
+///
+/// Concatenates all span contents and delegates to [`text_contains_url_like`].
+pub(crate) fn line_contains_url_like(line: &Line<'_>) -> bool {
+    let text: String = line
+        .spans
+        .iter()
+        .map(|span| span.content.as_ref())
+        .collect();
+    text_contains_url_like(&text)
+}
+
+/// Returns `true` if `line` contains both a URL-like token and at least one
+/// substantive non-URL token.
+///
+/// Decorative marker tokens (for example list prefixes like `-`, `1.`, `|`,
+/// `│`) are ignored for the non-URL side of this check.
+pub(crate) fn line_has_mixed_url_and_non_url_tokens(line: &Line<'_>) -> bool {
+    let text: String = line
+        .spans
+        .iter()
+        .map(|span| span.content.as_ref())
+        .collect();
+    text_has_mixed_url_and_non_url_tokens(&text)
+}
+
+/// Returns `true` if any whitespace-delimited token in `text` looks like a URL.
+///
+/// Recognized patterns:
+/// - Absolute URLs with a scheme (`https://…`, `ftp://…`, custom `myapp://…`).
+/// - Bare domain URLs (`example.com/path`, `www.example.com`, `localhost:3000/api`).
+/// - IPv4 hosts with a path (`192.168.1.1:8080/health`).
+///
+/// Surrounding punctuation (`()[]{}< >,.;:!'"`) is stripped before
+/// checking. Tokens that look like file paths (`src/main.rs`, `foo/bar`)
+/// are intentionally rejected — the host portion must be a valid domain
+/// name (with a recognized TLD), an IPv4 address, or `localhost`.
+pub(crate) fn text_contains_url_like(text: &str) -> bool {
+    text.split_ascii_whitespace().any(is_url_like_token)
+}
+
+/// Returns `true` if `text` contains at least one URL-like token and at least
+/// one substantive non-URL token.
+fn text_has_mixed_url_and_non_url_tokens(text: &str) -> bool {
+    let mut saw_url = false;
+    let mut saw_non_url = false;
+
+    for raw_token in text.split_ascii_whitespace() {
+        if is_url_like_token(raw_token) {
+            saw_url = true;
+        } else if is_substantive_non_url_token(raw_token) {
+            saw_non_url = true;
+        }
+
+        if saw_url && saw_non_url {
+            return true;
+        }
+    }
+
+    false
+}
+
+/// Decides whether a single whitespace-delimited token is URL-like.
+///
+/// Strips surrounding punctuation, then checks for an absolute URL
+/// (with `://`) or a bare domain URL (recognized host + path/query/fragment).
+fn is_url_like_token(raw_token: &str) -> bool {
+    let token = trim_url_token(raw_token);
+    !token.is_empty() && (is_absolute_url_like(token) || is_bare_url_like(token))
+}
+
+fn is_substantive_non_url_token(raw_token: &str) -> bool {
+    let token = trim_url_token(raw_token);
+    if token.is_empty() || is_decorative_marker_token(raw_token, token) {
+        return false;
+    }
+
+    token.chars().any(char::is_alphanumeric)
+}
+
+fn is_decorative_marker_token(raw_token: &str, token: &str) -> bool {
+    let raw = raw_token.trim();
+    matches!(
+        raw,
+        "-" | "*"
+            | "+"
+            | "•"
+            | "◦"
+            | "▪"
+            | ">"
+            | "|"
+            | "│"
+            | "┆"
+            | "└"
+            | "├"
+            | "┌"
+            | "┐"
+            | "┘"
+            | "┼"
+    ) || is_ordered_list_marker(raw, token)
+}
+
+fn is_ordered_list_marker(raw_token: &str, token: &str) -> bool {
+    token.chars().all(|c| c.is_ascii_digit())
+        && (raw_token.ends_with('.') || raw_token.ends_with(')'))
+}
+
+fn trim_url_token(token: &str) -> &str {
+    token.trim_matches(|c: char| {
+        matches!(
+            c,
+            '(' | ')'
+                | '['
+                | ']'
+                | '{'
+                | '}'
+                | '<'
+                | '>'
+                | ','
+                | '.'
+                | ';'
+                | ':'
+                | '!'
+                | '\''
+                | '"'
+        )
+    })
+}
+
+/// Checks for `scheme://host` patterns. Uses `url::Url::parse` for
+/// well-known schemes; falls back to `has_valid_scheme_prefix` for
+/// custom schemes that the `url` crate rejects.
+fn is_absolute_url_like(token: &str) -> bool {
+    if !token.contains("://") {
+        return false;
+    }
+
+    if let Ok(url) = url::Url::parse(token) {
+        let scheme = url.scheme().to_ascii_lowercase();
+        if matches!(
+            scheme.as_str(),
+            "http" | "https" | "ftp" | "ftps" | "ws" | "wss"
+        ) {
+            return url.host_str().is_some();
+        }
+        return true;
+    }
+
+    has_valid_scheme_prefix(token)
+}
+
+fn has_valid_scheme_prefix(token: &str) -> bool {
+    let Some((scheme, rest)) = token.split_once("://") else {
+        return false;
+    };
+    if scheme.is_empty() || rest.is_empty() {
+        return false;
+    }
+
+    let mut chars = scheme.chars();
+    let Some(first) = chars.next() else {
+        return false;
+    };
+    first.is_ascii_alphabetic()
+        && chars.all(|c| c.is_ascii_alphanumeric() || c == '+' || c == '-' || c == '.')
+}
+
+/// Checks for bare-domain URLs without a scheme: `host[:port]/path`,
+/// `host[:port]?query`, or `host[:port]#fragment`.
+///
+/// Requires that the host is `localhost`, an IPv4 address, or a valid
+/// domain name. Bare `host.tld` without a path/query/fragment is only
+/// accepted when the host starts with `www.`.
+///
+/// IPv6 bracket notation (`[::1]:8080`) is intentionally not handled.
+fn is_bare_url_like(token: &str) -> bool {
+    let (host_port, has_trailer) = split_host_port_and_trailer(token);
+    if host_port.is_empty() {
+        return false;
+    }
+
+    // Require URL-ish trailer for bare hosts unless token starts with www.
+    if !has_trailer && !host_port.to_ascii_lowercase().starts_with("www.") {
+        return false;
+    }
+
+    let (host, port) = split_host_and_port(host_port);
+    if host.is_empty() {
+        return false;
+    }
+    if let Some(port) = port
+        && !is_valid_port(port)
+    {
+        return false;
+    }
+
+    host.eq_ignore_ascii_case("localhost") || is_ipv4(host) || is_domain_name(host)
+}
+
+fn split_host_port_and_trailer(token: &str) -> (&str, bool) {
+    if let Some(idx) = token.find(['/', '?', '#']) {
+        (&token[..idx], true)
+    } else {
+        (token, false)
+    }
+}
+
+fn split_host_and_port(host_port: &str) -> (&str, Option<&str>) {
+    // We intentionally do not treat bracketed IPv6 as URL-like in this first pass.
+    if host_port.starts_with('[') {
+        return (host_port, None);
+    }
+
+    if let Some((host, port)) = host_port.rsplit_once(':')
+        && !host.is_empty()
+        && !port.is_empty()
+        && port.chars().all(|c| c.is_ascii_digit())
+    {
+        return (host, Some(port));
+    }
+
+    (host_port, None)
+}
+
+fn is_valid_port(port: &str) -> bool {
+    if port.is_empty() || port.len() > 5 || !port.chars().all(|c| c.is_ascii_digit()) {
+        return false;
+    }
+
+    port.parse::<u16>().is_ok()
+}
+
+fn is_ipv4(host: &str) -> bool {
+    let parts: Vec<&str> = host.split('.').collect();
+    if parts.len() != 4 {
+        return false;
+    }
+
+    parts
+        .iter()
+        .all(|part| !part.is_empty() && part.parse::<u8>().is_ok())
+}
+
+fn is_domain_name(host: &str) -> bool {
+    let host = host.to_ascii_lowercase();
+    if !host.contains('.') {
+        return false;
+    }
+
+    let mut labels = host.split('.');
+    let Some(tld) = labels.next_back() else {
+        return false;
+    };
+    if !is_tld(tld) {
+        return false;
+    }
+
+    labels.all(is_domain_label)
+}
+
+fn is_tld(label: &str) -> bool {
+    (2..=63).contains(&label.len()) && label.chars().all(|c| c.is_ascii_alphabetic())
+}
+
+fn is_domain_label(label: &str) -> bool {
+    if label.is_empty() || label.len() > 63 {
+        return false;
+    }
+
+    let mut chars = label.chars();
+    let Some(first) = chars.next() else {
+        return false;
+    };
+    let Some(last) = label.chars().next_back() else {
+        return false;
+    };
+
+    first.is_ascii_alphanumeric()
+        && last.is_ascii_alphanumeric()
+        && label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
+}
+
+/// Reconfigures wrapping options so that URL-like tokens are never split.
+///
+/// Sets `AsciiSpace` word separation (so `/` and `-` inside URLs are
+/// not treated as break points), disables `break_words`, and installs a
+/// custom `WordSplitter` that returns no split points for URL tokens
+/// while still allowing character-level splitting for non-URL words.
+pub(crate) fn url_preserving_wrap_options<'a>(opts: RtOptions<'a>) -> RtOptions<'a> {
+    opts.word_separator(textwrap::WordSeparator::AsciiSpace)
+        .word_splitter(textwrap::WordSplitter::Custom(split_non_url_word))
+        .break_words(false)
+}
+
+/// Custom `textwrap::WordSplitter` callback. Returns empty (no split
+/// points) for URL-like tokens so they are kept intact; returns every
+/// char-boundary index for everything else so non-URL words can still
+/// break at any position.
+fn split_non_url_word(word: &str) -> Vec<usize> {
+    if is_url_like_token(word) {
+        return Vec::new();
+    }
+
+    word.char_indices().skip(1).map(|(idx, _)| idx).collect()
+}
+
+/// Wraps a single ratatui `Line`, automatically switching to
+/// URL-preserving options when the line contains a URL-like token.
+///
+/// When no URL is detected, wrapping behavior is identical to
+/// [`word_wrap_line`]. When a URL is detected, the line is wrapped with
+/// [`url_preserving_wrap_options`] — URLs stay intact while non-URL
+/// words on the same line still break normally.
+#[must_use]
+pub(crate) fn adaptive_wrap_line<'a>(line: &'a Line<'a>, base: RtOptions<'a>) -> Vec<Line<'a>> {
+    let selected = if line_contains_url_like(line) {
+        url_preserving_wrap_options(base)
+    } else {
+        base
+    };
+    word_wrap_line(line, selected)
+}
+
+/// Wraps multiple input lines with URL-aware heuristics, applying
+/// `initial_indent` to the first line and `subsequent_indent` to the
+/// rest. Each line is independently checked for URLs; URL detection on
+/// one line does not affect wrapping of the others.
+///
+/// This is the multi-line counterpart to [`adaptive_wrap_line`] and is
+/// the primary wrapping entry point for most history-cell rendering.
+#[allow(private_bounds)]
+pub(crate) fn adaptive_wrap_lines<'a, I, L>(
+    lines: I,
+    width_or_options: RtOptions<'a>,
+) -> Vec<Line<'static>>
+where
+    I: IntoIterator<Item = L>,
+    L: IntoLineInput<'a>,
+{
+    let base_opts = width_or_options;
+    let mut out: Vec<Line<'static>> = Vec::new();
+
+    for (idx, line) in lines.into_iter().enumerate() {
+        let line_input = line.into_line_input();
+        let opts = if idx == 0 {
+            base_opts.clone()
+        } else {
+            base_opts
+                .clone()
+                .initial_indent(base_opts.subsequent_indent.clone())
+        };
+
+        let wrapped = adaptive_wrap_line(line_input.as_ref(), opts);
+        push_owned_lines(&wrapped, &mut out);
+    }
+
+    out
+}
+
 #[derive(Debug, Clone)]
 pub struct RtOptions<'a> {
     /// The width in columns at which the text will be wrapped.
@@ -644,4 +1092,162 @@ the kindness of the woman who tended
 them."#
         );
     }
+
+    #[test]
+    fn ascii_space_separator_with_no_hyphenation_keeps_url_intact() {
+        let line = Line::from(
+            "http://example.com/long-url-with-dashes-wider-than-terminal-window/blah-blah-blah-text/more-gibberish-text",
+        );
+        let opts = RtOptions::new(24)
+            .word_separator(textwrap::WordSeparator::AsciiSpace)
+            .word_splitter(textwrap::WordSplitter::NoHyphenation)
+            .break_words(false);
+
+        let out = word_wrap_line(&line, opts);
+
+        assert_eq!(out.len(), 1);
+        assert_eq!(
+            concat_line(&out[0]),
+            "http://example.com/long-url-with-dashes-wider-than-terminal-window/blah-blah-blah-text/more-gibberish-text"
+        );
+    }
+
+    #[test]
+    fn text_contains_url_like_matches_expected_tokens() {
+        let positives = [
+            "https://example.com/a/b",
+            "ftp://host/path",
+            "www.example.com/path?x=1",
+            "example.test/path#frag",
+            "localhost:3000/api",
+            "127.0.0.1:8080/health",
+            "(https://example.com/wrapped-in-parens)",
+        ];
+
+        for text in positives {
+            assert!(
+                text_contains_url_like(text),
+                "expected URL-like match for {text:?}"
+            );
+        }
+    }
+
+    #[test]
+    fn text_contains_url_like_rejects_non_urls() {
+        let negatives = [
+            "src/main.rs",
+            "foo/bar",
+            "key:value",
+            "just-some-text-with-dashes",
+            "hello.world", // no path/query/fragment and no www
+        ];
+
+        for text in negatives {
+            assert!(
+                !text_contains_url_like(text),
+                "did not expect URL-like match for {text:?}"
+            );
+        }
+    }
+
+    #[test]
+    fn line_contains_url_like_checks_across_spans() {
+        let line = Line::from(vec![
+            "see ".into(),
+            "https://example.com/a/very/long/path".cyan(),
+            " for details".into(),
+        ]);
+
+        assert!(line_contains_url_like(&line));
+    }
+
+    #[test]
+    fn line_has_mixed_url_and_non_url_tokens_detects_prose_plus_url() {
+        let line = Line::from("see https://example.com/path for details");
+        assert!(line_has_mixed_url_and_non_url_tokens(&line));
+    }
+
+    #[test]
+    fn line_has_mixed_url_and_non_url_tokens_ignores_pipe_prefix() {
+        let line = Line::from(vec![" │ ".into(), "https://example.com/path".into()]);
+        assert!(!line_has_mixed_url_and_non_url_tokens(&line));
+    }
+
+    #[test]
+    fn line_has_mixed_url_and_non_url_tokens_ignores_ordered_list_marker() {
+        let line = Line::from("1. https://example.com/path");
+        assert!(!line_has_mixed_url_and_non_url_tokens(&line));
+    }
+
+    #[test]
+    fn text_contains_url_like_accepts_custom_scheme_with_separator() {
+        assert!(text_contains_url_like("myapp://open/some/path"));
+    }
+
+    #[test]
+    fn text_contains_url_like_rejects_invalid_ports() {
+        assert!(!text_contains_url_like("localhost:99999/path"));
+        assert!(!text_contains_url_like("example.com:abc/path"));
+    }
+
+    #[test]
+    fn adaptive_wrap_line_keeps_long_url_like_token_intact() {
+        let line = Line::from("example.test/a-very-long-path-with-many-segments-and-query?x=1&y=2");
+        let out = adaptive_wrap_line(&line, RtOptions::new(20));
+        assert_eq!(out.len(), 1);
+        assert_eq!(
+            concat_line(&out[0]),
+            "example.test/a-very-long-path-with-many-segments-and-query?x=1&y=2"
+        );
+    }
+
+    #[test]
+    fn adaptive_wrap_line_preserves_default_behavior_for_non_url_tokens() {
+        let line = Line::from("a_very_long_token_without_spaces_to_force_wrapping");
+        let out = adaptive_wrap_line(&line, RtOptions::new(20));
+        assert!(
+            out.len() > 1,
+            "expected non-url token to wrap with default options"
+        );
+    }
+
+    #[test]
+    fn adaptive_wrap_line_mixed_line_wraps_long_non_url_token() {
+        let long_non_url = "a_very_long_token_without_spaces_to_force_wrapping";
+        let line = Line::from(format!("see https://ex.com {long_non_url}"));
+        let out = adaptive_wrap_line(&line, RtOptions::new(24));
+
+        assert!(
+            out.iter()
+                .any(|line| concat_line(line).contains("https://ex.com")),
+            "expected URL token to remain present, got: {out:?}"
+        );
+        assert!(
+            !out.iter()
+                .any(|line| concat_line(line).contains(long_non_url)),
+            "expected long non-url token to wrap on mixed lines, got: {out:?}"
+        );
+    }
+
+    #[test]
+    fn wrap_ranges_trim_handles_owned_lines_with_penalty_char() {
+        fn split_every_char(word: &str) -> Vec<usize> {
+            word.char_indices().skip(1).map(|(idx, _)| idx).collect()
+        }
+
+        let text = "a_very_long_token_without_spaces";
+        let opts = Options::new(8)
+            .word_separator(textwrap::WordSeparator::AsciiSpace)
+            .word_splitter(textwrap::WordSplitter::Custom(split_every_char))
+            .break_words(false);
+
+        let ranges = wrap_ranges_trim(text, opts);
+        let rebuilt = ranges
+            .iter()
+            .map(|range| &text[range.clone()])
+            .collect::<String>();
+
+        assert_eq!(rebuilt, text);
+        assert!(ranges.len() > 1, "expected wrapped ranges, got: {ranges:?}");
+    }
 }
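The split-point rule used by `split_non_url_word` in the diff above can be exercised in isolation. This std-only sketch assumes a crude `contains("://")` URL check in place of the commit's full heuristic, and `break_word` is a hypothetical greedy breaker standing in for `textwrap`; unlike `textwrap`, it inserts no hyphen penalty at break points.

```rust
/// Assumption: a token is URL-like if it contains "://". The commit's
/// heuristic is considerably stricter and broader than this.
fn is_url_like(word: &str) -> bool {
    word.contains("://")
}

/// Byte offsets at which `word` may be broken: none for URL-like tokens,
/// every char boundary after the first char for everything else.
fn split_points(word: &str) -> Vec<usize> {
    if is_url_like(word) {
        return Vec::new();
    }
    word.char_indices().skip(1).map(|(idx, _)| idx).collect()
}

/// Hypothetical greedy breaker: cuts `word` into pieces of at most `width`
/// chars, breaking only at offsets allowed by `split_points`. Words with
/// no split points are returned whole, even if they exceed `width`.
fn break_word(word: &str, width: usize) -> Vec<&str> {
    let points = split_points(word);
    if points.is_empty() {
        return vec![word];
    }

    let mut pieces = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    let mut count = 0; // chars accumulated in the current piece
    for (idx, _) in word.char_indices() {
        if count == width && points.contains(&idx) {
            pieces.push(&word[start..idx]);
            start = idx;
            count = 0;
        }
        count += 1;
    }
    pieces.push(&word[start..]);
    pieces
}
```

Because the offsets come from `char_indices`, multi-byte characters are never cut in half: for `"héllo"` the split points are byte offsets 1, 3, 4, and 5. A URL-like token produces no split points, so it stays on one piece and is left for the terminal to character-wrap.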