Title: | Assess Image Alt Text on a Web Page |
---|---|
Description: | Scrape image element attributes from a webpage, detect alternative (alt) text and assess it with simple heuristics. Alt text is important for users of assistive technologies, like screen readers, for understanding the content of images. This package should be used in conjunction with other accessibility assessment tools for more comprehensive coverage. |
Authors: | Matt Dray [aut, cre] |
Maintainer: | Matt Dray <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-10-29 04:31:56 UTC |
Source: | https://github.com/matt-dray/altcheckr |
Infer whether an image's alt text is missing or could be improved.
alt_check(
  attributes_df,
  max_char = 125,
  min_char = 20,
  file_ext = ".jpg$|.jpeg$|.png$|.svg$|.gif$",
  redundant_pattern = "image|picture|photo|graph|plot|diagram"
)
attributes_df |
A data.frame/tibble with image attributes, produced by
|
max_char |
A numeric value. Alt text longer than this is flagged. |
min_char |
A numeric value. Alt text shorter than this is flagged. |
file_ext |
A character string. A regular expression of image file extensions. |
redundant_pattern |
A character string. A regular expression of 'redundant' phrases in alt text. |
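As a rough illustration of how the two pattern arguments behave, the sketch below applies the default values to a few alt text strings using base R's `grepl()`. The alt text values are made up, and the use of case-sensitive `grepl()` is an assumption; the package may apply the patterns differently.

```r
# Default patterns copied from the alt_check() usage above
file_ext <- ".jpg$|.jpeg$|.png$|.svg$|.gif$"
redundant_pattern <- "image|picture|photo|graph|plot|diagram"

# Hypothetical alt text values, for illustration only
alt <- c("chart.png", "A picture of a cat.", "Bar chart of sales by region.")

# Assumption: plain, case-sensitive matching against the alt text
looks_like_filename <- grepl(file_ext, alt)
has_redundant_phrase <- grepl(redundant_pattern, alt)

data.frame(alt, looks_like_filename, has_redundant_phrase)
```

Note that `.` is unescaped in the default `file_ext`, so as a regular expression it matches any character: `grepl(".gif$", "An animated gif")` is also `TRUE`. A stricter pattern such as `"\\.gif$"` could be passed if that matters for your pages.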
A tibble object with classes tbl_df, tbl and data.frame. In addition to the columns provided by alt_get, it also returns:
alt_exists
"Exists", "Missing" or intentionally "Empty".
nchar_count
Integer. Number of characters.
nchar_assess
"Short", "Long" or "OK", depending on inputs to min_char and max_char.
file_ext
Logical. Does it look like the alt text might just be a filename (e.g. ends with ".jpg")?
self_evident
Logical. Is a redundant phrase used in the alt text (e.g. "a picture of")?
terminal_punct
Logical. Does the alt text end with terminal punctuation, like a period, to allow a screen reader to parse it as a sentence?
spellcheck
Character. Words to check for misspelling. These could be misread by a screen reader.
not_basic
Character list column. Words that match the 850 words of Charles Kay Ogden's Basic English.
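The simpler columns above can be approximated in base R. This is a minimal sketch under stated assumptions, not the package's implementation: the input data frame is hypothetical, `NA` is taken to mean a missing alt attribute, `""` is taken to mean an intentionally empty one, and the terminal punctuation set `.`, `!`, `?` is a guess.

```r
# Hypothetical input resembling the src and alt columns from alt_get()
attributes_df <- data.frame(
  src = c("a.png", "b.png", "c.png", "d.png"),
  alt = c(NA, "", "Cat", "A tabby cat asleep on a windowsill."),
  stringsAsFactors = FALSE
)

max_char <- 125
min_char <- 20
alt <- attributes_df$alt

# Assumption: NA means no alt attribute, "" means intentionally empty
alt_exists <- ifelse(is.na(alt), "Missing", ifelse(alt == "", "Empty", "Exists"))

# nchar() returns NA for a missing string
nchar_count <- nchar(alt)

# Assumption: length is only assessed when alt text exists
nchar_assess <- ifelse(
  alt_exists != "Exists", NA_character_,
  ifelse(nchar_count < min_char, "Short",
         ifelse(nchar_count > max_char, "Long", "OK"))
)

# Assumption: terminal punctuation means ".", "!" or "?" at the end
terminal_punct <- grepl("[.!?]$", alt)

checked <- cbind(attributes_df, alt_exists, nchar_count, nchar_assess, terminal_punct)
checked
```

Keeping each heuristic as its own column, as the package does, makes it easy to filter for one kind of problem at a time, for example `checked[checked$alt_exists == "Missing", ]`.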
## Not run:
url <- "https://alphagov.github.io/accessibility-tool-audit/test-cases.html#images"
attr_df <- alt_get(url)
alt_check(attr_df)
## End(Not run)
Scrape (politely) a web page's HTML and extract the attributes from each image element (<img>) on that page.
alt_get(webpage, all_attributes = FALSE)
webpage |
A character string describing the URL of a webpage. Must be a
full path including |
all_attributes |
Logical. Do you want to return all the image
attributes as columns or just |
A tibble object with classes tbl_df, tbl and data.frame. One row per image element and one column per attribute, unless all_attributes = FALSE, in which case only these columns are returned:
src
The source of the image as a filepath.
alt
The alternative (alt) text for the image.
longdesc
The path to a file that contains a longer description for complex images (only returned if present).
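To make the shape of the returned table concrete, the sketch below builds a similar src/alt data frame from an inline HTML string using only base R. It is not how alt_get works: the HTML fragment and the `get_attr()` helper are hypothetical, no web request is made, and regular expressions are a fragile way to read HTML. For real pages a dedicated HTML parser (for example the xml2 or rvest packages) is the more robust choice.

```r
# Hypothetical HTML fragment, used here instead of a live page
html <- '<p><img src="cat.png" alt="A tabby cat asleep."><img src="logo.svg"></p>'

# Pull out each <img ...> tag as a string
img_tags <- regmatches(html, gregexpr("<img[^>]*>", html))[[1]]

# Hypothetical helper: value of one double-quoted attribute per tag, NA if absent
get_attr <- function(tags, name) {
  pattern <- paste0("\\s", name, '="([^"]*)"')
  matches <- regmatches(tags, regexec(pattern, tags))
  vapply(
    matches,
    function(x) if (length(x) == 2) x[2] else NA_character_,
    character(1)
  )
}

# One row per image element, one column per attribute of interest
img_df <- data.frame(
  src = get_attr(img_tags, "src"),
  alt = get_attr(img_tags, "alt"),
  stringsAsFactors = FALSE
)
img_df
```

The second image has no alt attribute, so its `alt` value is `NA`, which is distinct from an attribute that is present but empty (`alt=""`).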
## Not run:
test_url <- "https://www.bbc.co.uk/news"
alt_get(webpage = test_url)
## End(Not run)
A dummy data frame to mimic output from alt_get.
example_get
A data frame with 11 rows and 2 variables:
a fake image source
dummy alt text with different types of problems
...