Processed DOM
conatus.utils.browser.dom.processed
Processed DOM type.
🐴 Our DOM workhorse: As explained on our Browser concept page, ProcessedDOM is the class to use if you want to build actions that leverage the DOM in its entirety.
- The other workhorse is DOMNode, which is the class to use if you want to build actions that leverage individual nodes in the DOM.
- This class is not the raw representation of the DOM that we get from Chrome, but a cleaned-up version of it.
Two ways to get a ProcessedDOM object
You can get a ProcessedDOM object either:
- from a ChromeDOM object (from_chrome_dom)
- or from a Page object (from_page).*

* Page can be either a Playwright Page or a ConatusPage.
from conatus.utils.browser.dom.processed import ProcessedDOM
# Example 1: From a Page object
# from conatus.utils.browser import Browser
# url = "https://example.com"
# browser = Browser()
# browser.goto(url)
# page = browser.page
# processed_dom = ProcessedDOM.from_page(page)
# Example 2: From a ChromeDOM object
from conatus.utils.browser.dom.fixtures import example_chrome_dom_inputtypes
inputtypes_dom = example_chrome_dom_inputtypes()
width = 1100
processed_dom = ProcessedDOM.from_chrome_dom(inputtypes_dom, width)
assert processed_dom.page_title == "Input Type Sandbox"
Additional references
- API: Chrome DOM: The Chrome DOM classes, which this class is derived from.
- API: DOM nodes: The DOMNode class, which you will want to use in conjunction with this class.
ProcessedDOM
Bases: BaseModel
Processed DOM type.
| ATTRIBUTE | DESCRIPTION |
|---|---|
| nodes | Nodes in the DOM. |
| root_node | Root node of the DOM. |
| input_elements | Input elements in the DOM. |
| clickable_elements | Clickable elements in the DOM. |
| elements_count | Number of elements that are either clickable or inputable in the DOM. |
| page_title | Title of the web page. |
| page_url | URL of the web page. |
| scroll_position | Scroll position of the page. |
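As a rough illustration of how these attributes fit together, the sketch below builds a one-line summary from any object that exposes them. The helper name `summarize_dom` and the `SimpleNamespace` stand-in are hypothetical and not part of Conatus; the stand-in only mimics the attribute names listed above so that the snippet runs without a browser.

```python
from types import SimpleNamespace


def summarize_dom(dom) -> str:
    """Build a short summary from the attributes listed in the table above.

    `dom` is expected to expose the same attribute names as ProcessedDOM;
    this helper is a hypothetical illustration, not a Conatus function.
    """
    return (
        f"{dom.page_title} ({dom.page_url}): "
        f"{len(dom.input_elements)} input, "
        f"{len(dom.clickable_elements)} clickable, "
        f"{dom.elements_count} interactive in total, "
        f"scrolled to {dom.scroll_position}"
    )


# Stand-in object with made-up values (assumption: the element
# containers support len(), as dicts and lists do).
fake_dom = SimpleNamespace(
    page_title="Input Type Sandbox",
    page_url="https://example.com",
    input_elements={0: "text field", 1: "checkbox"},
    clickable_elements={i: f"element {i}" for i in range(6)},
    elements_count=6,
    scroll_position=(0, 0),
)

summary = summarize_dom(fake_dom)
print(summary)
```

With a real ProcessedDOM, the same helper should work unchanged as long as the attribute names match the table.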
get_interactive_elements
staticmethod
get_interactive_elements(
    nodes: list[DOMNode], scroll_position: tuple[int, int]
) -> tuple[dict[int, DOMNode], dict[int, DOMNode], int]
Get interactive elements from a list of DOM nodes.
Interactive elements are elements that can be clicked or that accept input. Under the hood, we call a recursive function to find these elements.
from conatus.utils.browser.dom.fixtures import (
    example_chrome_dom_inputtypes
)
from conatus.utils.browser.dom.processed import ProcessedDOM

inputtypes_dom = example_chrome_dom_inputtypes()
width = 1100
nodes = inputtypes_dom.process_nodes(width)
device_pixel_ratio = inputtypes_dom.get_device_pixel_ratio(width)
scroll_position = (
    int(inputtypes_dom.document.scroll_offset_x / device_pixel_ratio),
    int(inputtypes_dom.document.scroll_offset_y / device_pixel_ratio),
)
input_elements, clickable_elements, elements_count = (
    ProcessedDOM.get_interactive_elements(nodes, scroll_position)
)
assert len(input_elements) == 2
assert len(clickable_elements) == 6
assert elements_count == 6
| PARAMETER | DESCRIPTION |
|---|---|
| nodes | The list of DOM nodes. |
| scroll_position | The scroll position of the page. |
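The example above derives `scroll_position` by dividing Chrome's scroll offsets by the device pixel ratio and truncating to integers. The sketch below isolates that arithmetic in a standalone function; the name `scale_scroll_offsets` and the sample values are hypothetical, and the guard against a non-positive ratio is an added assumption rather than documented Conatus behavior.

```python
def scale_scroll_offsets(
    offset_x: float, offset_y: float, device_pixel_ratio: float
) -> tuple[int, int]:
    """Scale raw scroll offsets by the device pixel ratio.

    Mirrors the arithmetic in the example above: divide each offset by
    the ratio, then truncate toward zero with int().
    """
    if device_pixel_ratio <= 0:
        # Assumption: a non-positive ratio is treated as invalid input.
        raise ValueError("device_pixel_ratio must be positive")
    return (
        int(offset_x / device_pixel_ratio),
        int(offset_y / device_pixel_ratio),
    )


# Made-up values: a page scrolled 1200 units down at a ratio of 2.
scroll_position = scale_scroll_offsets(0.0, 1200.0, 2.0)
print(scroll_position)  # (0, 600)
```

The resulting tuple has the `tuple[int, int]` shape that `get_interactive_elements` expects for `scroll_position`.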
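| RETURNS | DESCRIPTION |
|---|---|
| dict[int, DOMNode] | The input elements. |
| dict[int, DOMNode] | The clickable elements. |
| int | The number of elements that are either clickable or inputable. |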
| RAISES | DESCRIPTION |
|---|---|
| ValueError | If there are no bounds in the root node (which probably indicates a bug on Chrome's side). |
Source code in conatus/utils/browser/dom/processed.py
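The description above mentions that interactive elements are found by a recursive function. The following is a minimal sketch of that general idea on a simplified tree, not the library's actual implementation: `SketchNode`, the tag sets, and `collect_interactive` are all hypothetical, and the real method also takes the scroll position and node bounds into account.

```python
from dataclasses import dataclass, field


@dataclass
class SketchNode:
    """Hypothetical stand-in for DOMNode; the real class has more fields."""

    node_id: int
    tag: str
    children: list["SketchNode"] = field(default_factory=list)


# Assumed tag sets, chosen only for illustration.
INPUT_TAGS = {"input", "textarea", "select"}
CLICKABLE_TAGS = {"a", "button"}


def collect_interactive(
    node: SketchNode,
    inputs: dict[int, SketchNode],
    clickables: dict[int, SketchNode],
) -> None:
    """Walk the tree depth-first and file each node by its tag."""
    if node.tag in INPUT_TAGS:
        inputs[node.node_id] = node
    if node.tag in CLICKABLE_TAGS:
        clickables[node.node_id] = node
    for child in node.children:
        collect_interactive(child, inputs, clickables)


root = SketchNode(
    0,
    "body",
    [
        SketchNode(1, "input"),
        SketchNode(2, "div", [SketchNode(3, "button"), SketchNode(4, "a")]),
    ],
)
inputs: dict[int, SketchNode] = {}
clickables: dict[int, SketchNode] = {}
collect_interactive(root, inputs, clickables)
print(sorted(inputs), sorted(clickables))  # [1] [3, 4]
```

Returning two dictionaries keyed by an integer mirrors the `dict[int, DOMNode]` return types in the signature; what the real keys represent is not stated here, so the node ID used in this sketch is an assumption.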
from_chrome_dom
staticmethod
from_chrome_dom(
    chrome_dom: ChromeDOM, width: float
) -> ProcessedDOM
Create a ProcessedDOM object from a ChromeDOM object.
from conatus.utils.browser.dom.fixtures import (
    example_chrome_dom_inputtypes
)
from conatus.utils.browser.dom.processed import ProcessedDOM

inputtypes_dom = example_chrome_dom_inputtypes()
inputtypes_dom_width = 1100
processed_dom = ProcessedDOM.from_chrome_dom(
    inputtypes_dom, inputtypes_dom_width
)
assert processed_dom.page_title == "Input Type Sandbox"
assert len(processed_dom.input_elements) == 2
assert len(processed_dom.clickable_elements) == 6
assert processed_dom.elements_count == 6
| PARAMETER | DESCRIPTION |
|---|---|
| chrome_dom | The ChromeDOM object. |
| width | The width of the page. We need it to calculate the bounds of the nodes. |
| RETURNS | DESCRIPTION |
|---|---|
| ProcessedDOM | The ProcessedDOM object. |
from_page_async
async
classmethod
from_page_async(page: Page | Page) -> ProcessedDOM
Create a ProcessedDOM from a Page (either Playwright or Conatus).
Note: The expected type for a Playwright page is playwright.async_api._generated.Page.
| PARAMETER | DESCRIPTION |
|---|---|
| page | The Page object (either a Playwright Page or a Conatus Page). |
| RETURNS | DESCRIPTION |
|---|---|
| ProcessedDOM | The ProcessedDOM object. |
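This method has no example of its own, so here is a minimal sketch of how it might be awaited with a page from Playwright's async API. The Playwright calls (`async_playwright`, `chromium.launch`, `new_page`, `goto`, `close`) are standard; the wrapper name `processed_dom_from_url` and the target URL are hypothetical, and the sketch assumes that a plain Playwright page is accepted as-is, as the description above suggests. Imports are done lazily inside the function so that defining it does not require a browser to be installed.

```python
async def processed_dom_from_url(url: str):
    """Open `url` in headless Chromium and return its ProcessedDOM.

    Hypothetical helper: assumes Playwright and Conatus are installed
    and that from_page_async accepts a Playwright async Page.
    """
    # Lazy imports: only needed when the coroutine actually runs.
    from playwright.async_api import async_playwright

    from conatus.utils.browser.dom.processed import ProcessedDOM

    async with async_playwright() as playwright:
        browser = await playwright.chromium.launch()
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await ProcessedDOM.from_page_async(page)
        finally:
            # Always release the browser, even if processing fails.
            await browser.close()


# Usage (requires network access and an installed Chromium):
# import asyncio
# dom = asyncio.run(processed_dom_from_url("https://example.com"))
# print(dom.page_title)
```

Code that is not already running inside an event loop can use the synchronous from_page instead.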
from_page
classmethod
from_page(page: Page | Page) -> ProcessedDOM
Create a ProcessedDOM from a Page (either Playwright or Conatus).
from conatus.utils.browser import Browser
from conatus.utils.browser.dom.processed import ProcessedDOM
url = "https://example.com"
browser = Browser()
browser.goto(url)
page = browser.page
processed_dom = ProcessedDOM.from_page(page)
| PARAMETER | DESCRIPTION |
|---|---|
| page | The Page object (either a Playwright Page or a Conatus Page). |
| RETURNS | DESCRIPTION |
|---|---|
| ProcessedDOM | The ProcessedDOM object. |